I have some old code that was written to support work on a local network, but it may now be run by remote workers. Individual file system accesses that were reasonably fast on the local network are intolerably slow over remote access because of the additional overhead on each request.

For reading files, I have been able to improve speed greatly by reading the entire file into a string in a single access and replacing the StreamReader used to process it with a StringReader. But searching the file system for possible file paths is a bit trickier.
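A minimal sketch of that read-once approach, for illustration only. The class and method names (`WholeFileReader`, `ReadNonEmptyLines`) are hypothetical, and the line filtering merely stands in for whatever processing the original StreamReader loop performed.

```csharp
using System.Collections.Generic;
using System.IO;

public static class WholeFileReader
{
    // Pulls the whole file into memory with one bulk read, then parses it
    // from the in-memory string instead of streaming it from the share.
    public static List<string> ReadNonEmptyLines(string path)
    {
        string contents = File.ReadAllText(path);

        var lines = new List<string>();
        using (var reader = new StringReader(contents))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                // Placeholder processing: keep only non-empty lines.
                if (line.Length > 0)
                {
                    lines.Add(line);
                }
            }
        }
        return lines;
    }
}
```

Because StringReader and StreamReader both derive from TextReader, existing parsing code that only calls ReadLine can usually stay unchanged.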

There is a particular directory (the "master") which contains many subdirectories. Some of these subdirectories (the "models") have a subdirectory ("current") and a file (the "kompdef") in that subdirectory which contains information pertinent to the model. There are many other directories in the master which do not have a kompdef file and thus are not models, and the model directories themselves have many directories other than "current" (some of which may contain kompdef files that I do not care about). I have a form with a drop-down that lists the models for the user to select from. The list of models that populates the drop-down is created with the following code:

// KompdefTail = @"current\komdef"  (it's on a Unix file system, so no extension)
List<string> ModelNames = new List<string>();
foreach (DirectoryInfo Model in (new DirectoryInfo(MasterPath)).GetDirectories())  
{
   // Path.Combine supplies the directory separator that plain concatenation would omit.
   if (new FileInfo(Path.Combine(Model.FullName, KompdefTail)).Exists)
   {
      ModelNames.Add(Model.Name);
   }
}

However, this requires many file system accesses, so the form takes around 30 seconds to load remotely, even though it takes less than 1 second on-site.

Is there some way I can get a list of paths of the form:

  • masterpath\*\kompdeftail

where masterpath and kompdeftail are known, with a minimal number of file system accesses?

  • You could search for files instead of iterating through all the directories. stackoverflow.com/a/9830116/1390548 Commented Mar 27, 2024 at 14:31
  • @Jawad - Thank you for the suggestion. Unfortunately, because these models have a huge number of subdirectories, many of which contain a huge number of other files, searching through them all takes far longer than the original method (I gave up waiting for it to finish). In other circumstances it might be a perfect solution, but because of details I had not mentioned, it won't work for me. (Even the kompdef file itself is actually in a subdirectory of the model directory.) Commented Mar 27, 2024 at 15:45
  • Would it be possible to store a list/database of all these "kompdef"-files for quicker access? It could introduce various consistency problems, but the performance might be worth it. Best would likely be to move everything to a service, but that would likely be a major undertaking. Commented Mar 27, 2024 at 16:02
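The list/database idea in the last comment could be sketched as a small cache file kept on the remote worker's own machine. Everything here is an assumption for illustration: the names `ModelListCache` and `Load`, the one-name-per-line cache format, and the age-based refresh rule. The slow directory scan is passed in as a delegate so that it only runs when the cache is missing or stale.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class ModelListCache
{
    // Returns the cached model names if the cache file is younger than maxAge;
    // otherwise runs the slow scan once and rewrites the cache file.
    public static List<string> Load(
        string cachePath,
        TimeSpan maxAge,
        Func<IEnumerable<string>> slowScan)
    {
        bool fresh = File.Exists(cachePath)
            && DateTime.UtcNow - File.GetLastWriteTimeUtc(cachePath) < maxAge;

        if (fresh)
        {
            // One local read replaces the per-directory probes.
            return File.ReadAllLines(cachePath).ToList();
        }

        List<string> names = slowScan().ToList();
        File.WriteAllLines(cachePath, names);
        return names;
    }
}
```

The trade-off is the consistency problem the comment mentions: a model added or removed on the server stays invisible until the cache expires or is deleted, so a manual refresh option is probably worth having.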

1 Answer

  • Don't use FileInfo and DirectoryInfo. Use the static File and Directory classes instead, as these retrieve just the file names, not any other info.
  • Use Directory.EnumerateFiles to search with a wildcard and to do recursive searches. It also streams the results in, rather than storing them in an array.
  • You don't need to use Exists: if you got a file back, then it exists.
ModelNames.AddRange(
    Directory.EnumerateFiles(MasterPath, KompdefTail, SearchOption.AllDirectories)
);
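The recursive search above matches files anywhere under the master. If only files at the fixed location described in the question (master, then model, then "current") should count, a non-recursive variant could probe exactly one path per model directory. This is a sketch under that assumption; `ModelFinder` and `FindModels` are hypothetical names, and `kompdefTail` is assumed to be a relative path with no leading separator.

```csharp
using System.Collections.Generic;
using System.IO;

public static class ModelFinder
{
    // Lists the top-level directories of the master once, then checks a single
    // fixed path inside each one. No recursion into other subdirectories.
    public static List<string> FindModels(string masterPath, string kompdefTail)
    {
        var names = new List<string>();
        foreach (string dir in Directory.EnumerateDirectories(masterPath))
        {
            if (File.Exists(Path.Combine(dir, kompdefTail)))
            {
                // dir has no trailing separator, so GetFileName yields the directory name.
                names.Add(Path.GetFileName(dir));
            }
        }
        return names;
    }
}
```

A call might look like `ModelFinder.FindModels(MasterPath, Path.Combine("current", "kompdef"))`. Note that this still issues one existence check per top-level directory, so it reduces per-call overhead rather than the number of requests.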

5 Comments

First, my apologies. To simplify the description, I had left out some information that I considered irrelevant. Jawad's comment had already shown me I was mistaken about that, but I failed to edit the question at the time. In particular, the kompdef file is not directly in the model directory, but under a specific subdirectory, "current". So I want to search all subdirectories of the master to see whether they have a subdirectory called "current" with a file in it called "kompdef". I do not want "kompdef" files under other subdirectories (and there are many of those).
Second, wouldn't using an enumerable increase the number of file system accesses, since each call to the enumerable would invoke a separate access to the file system? The number of accesses is the big time waster I am trying to avoid when working remotely.
Either way, that access is being made. Network shares use the SMB/CIFS protocol, which at best is going to load the structure of each folder. There is no way to change how that works. Your exact situation is not clear to me; maybe you can use Directory.EnumerateDirectories(MasterPath, "current" and then search through that.
The problem is that when running remotely, each time it sends a request for information to the file server, it spends a lot of time just establishing the communication (likely setting up a secure channel). This is the vast majority of the time spent. Any changes on my end that do not cut down on the number of separate requests sent will have practically no impact on the run time. I think JonasH's suggestion is going to be my best bet, though it risks the list not being updated when models are added or removed.
The SMB/CIFS service should hold the connection open for some time; it doesn't close it immediately. It's just that SMB is very chatty, so it may take time over a high-latency connection.
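If the cost is mostly round-trip latency rather than connection setup, one option not raised in the thread is to issue the existence checks concurrently so that their waits overlap. The sketch below uses PLINQ for that. It is an assumption that the server and link tolerate several requests in flight, and `ParallelModelProbe`, `FindModels`, and `maxParallel` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class ParallelModelProbe
{
    // Checks each top-level directory for the kompdef path on several threads
    // at once, so the network waits overlap instead of running back to back.
    public static List<string> FindModels(string masterPath, string kompdefTail, int maxParallel)
    {
        return Directory.EnumerateDirectories(masterPath)
            .AsParallel()
            .WithDegreeOfParallelism(maxParallel)
            .Where(dir => File.Exists(Path.Combine(dir, kompdefTail)))
            .Select(dir => Path.GetFileName(dir))
            // Parallel results arrive in no fixed order, so sort for the drop-down.
            .OrderBy(name => name, StringComparer.OrdinalIgnoreCase)
            .ToList();
    }
}
```

The total number of requests is unchanged, so this only helps if the requests can actually proceed in parallel; measuring with a few values of `maxParallel` would show whether it does.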
