'C# ODBC driver with DBF database in parallel not working as intended
Hi everyone this is my last resort, i think I'm getting something wrong but can't get unstuck
I was trying to load some old DBF files at startup to create my data context in the new application but soon discovered that some of these really take up a long time to load at times even 1-2 minutes (these are sometimes 80-100MB databases) which is not acceptable at the start of the application. My idea was to load them in parallel which would mean at worst i have a load time of 1-2 minutes(the bigger one) but when using a stopwatch to check the execution time i wasn't getting the 1-2 minutes expected but i instead got the sum of the time as if i was doing them one by one.
This is the code i wrote to access all the databases in the folder using an obdc adapter, in reality, the query executes very fast is the adapter that is taking a long time to load things in Datatable, i switched to Task after not getting any result with "Parallel.Foreach()" i even tried to switch on and off the background Fetch but with no avail, is there something i can try or this driver isn't made to be used by more resources?
public static List<DataTable> SelectAllParallel(string folder)
{
System.Data.Odbc.OdbcConnection conn = new System.Data.Odbc.OdbcConnection("Driver={Microsoft Visual FoxPro Driver};SourceType=DBF;SourceDB=" + folder + ";Exclusive=No;Collate=Machine;NULL=NO;DELETED=NO;BACKGROUNDFETCH=Yes;");
List<string> Databases = new List<string>();
foreach (var file in Directory.EnumerateFiles(folder, "*.dbf"))
Databases.Add(Path.GetFileName(file));
conn.Open();
List<OdbcDataReader> QueryResult = new List<OdbcDataReader>();
List<DataTable> Results = new List<DataTable>();
var watch = new System.Diagnostics.Stopwatch();
watch.Start();
foreach (string database in Databases)
{
string strQuery = $"SELECT * FROM [{database}]";
OdbcCommand command = new OdbcCommand(strQuery, conn);
QueryResult.Add(command.ExecuteReader());
}
List<Task> tsk = new List<Task>();
foreach(OdbcDataReader SingQuery in QueryResult)
{
tsk.Add(new Task(() => { DataTable dt = new DataTable();dt.Load(SingQuery); Results.Add(dt); }));
}
foreach (var tssk in tsk)
tssk.Start();
Task.WaitAll(tsk.ToArray());
watch.Stop();
conn.Close();
var h = watch.Elapsed;
return Results;
}
Solution 1:[1]
You are using background fetching and multiple threads which is exactly what Microsoft warns you not to do:
https://docs.microsoft.com/bs-latn-ba/sql/odbc/microsoft/thread-support-visual-foxpro-odbc-driver
Avoid using background fetch when you call the driver from multithreaded applications.
The VFP ODBC driver is thread-safe. That means, you can call it from multiple threads (if you avoid background fetching). However, it is not multi-threaded. Only one query at a time is executed. The others are blocked by a semaphore that VFP is using to synchronize access to its query engine.
Loading all rows into memory with SELECT * FROM table is usually considered a bad approach. You get better performance if you only load the records you need. The ODBC driver isn't designed and optimized to return all rows.
If your specific case needs access to all rows in all tables, for example, because your application is a converter that converts all data to a different format, you might get better performance by using a C# DBF library. There could be compatibility issues, though, because most of these libraries only implement older DBF file formats. You also should avoid writing with these libraries, because that could lead to corruption.
If you are only reading the files and not changing them later, and if your files are on a network share, you will also see better performance if you copy the entire file into a local temp folder and then use the ODBC driver to read from the temp folder.
The reason for this is that VFP reads records synchronously. It reads one record from the file server and waits until the record has been served to the client before reading the next record. This sends many IP packets across the network (up to one per record). Network latency and packet trough-put rather than bandwidth determine transfer speed in this case. With local files you don't have these limitations.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | Christof Wollenhaupt |