Apache OpenOffice (AOO) Bugzilla – Issue 57961
osl_getAbsoluteFileURL_impl_ calls lstat far too often
Last modified: 2013-07-30 02:20:26 UTC
On startup, OpenOffice.org runs lstat on /home more than 400 times.
Created attachment 31529: nuke calls to _osl_resolvepath and rely on higher-level file operations generating exceptions on non-existent paths
I agree that this many "lstat" calls during startup is irritating, especially since (as you already said) they hit the exact same files over and over again. If I understand correctly, most of these "lstat" calls are triggered by "realpath", which walks a file path component by component and flattens it by replacing symbolic links etc., returning a resolved path. Since most client code does not care whether a path is resolved, it is absolutely viable to work with unresolved paths.

Unfortunately, the OSL file API documentation for osl_getAbsoluteFileURL states that the returned path has been resolved, so changing the implementation according to your patch is "somewhat" incompatible. I therefore suggest one of the following:

-A- Create a new function that does not resolve the path, and change all code that does not rely on resolved paths to use it.
-B- Change the behavior of osl_getAbsoluteFileURL as suggested in your patch, update the documentation accordingly, and provide a new function for actually resolving file URLs (which may be needed and would otherwise not be available).

-A- is obviously the safe but more work-intensive solution, whereas -B- is more straightforward and simple but riskier in terms of breaking things. Depending on the time frame in which you would like to see your patches go in, I suggest going with -A- for 2.0.x releases or with -B- for 2.x.
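To illustrate what a non-resolving variant (option -A-) could look like, here is a minimal sketch of purely lexical normalization. The name normalizeLexically is hypothetical and is not part of the OSL API, and the sketch assumes an absolute POSIX path rather than a file URL. It never touches the file system, so it issues no lstat calls. It also leaves symbolic links unresolved, so its result can differ from what realpath returns when a ".." follows a symlinked directory.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper, not part of the OSL API: collapses "." and ".."
// in an absolute POSIX path purely by string manipulation, so it never
// calls lstat() or readlink(). Symbolic links are left unresolved.
std::string normalizeLexically(const std::string& absPath)
{
    assert(!absPath.empty() && absPath[0] == '/');  // absolute paths only

    std::vector<std::string> parts;
    std::string::size_type pos = 0;
    while (pos < absPath.size())
    {
        std::string::size_type next = absPath.find('/', pos);
        if (next == std::string::npos)
            next = absPath.size();
        const std::string segment = absPath.substr(pos, next - pos);
        if (segment == "..")
        {
            if (!parts.empty())
                parts.pop_back();  // ".." at the root stays at the root
        }
        else if (!segment.empty() && segment != ".")
        {
            parts.push_back(segment);  // skip empty and "." segments
        }
        pos = next + 1;
    }

    std::string result;
    for (const std::string& part : parts)
        result += "/" + part;
    return result.empty() ? "/" : result;
}
```

For example, normalizeLexically("/home/user/../user/./doc.odt") yields "/home/user/doc.odt" without a single system call.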
Michael, any comments on my comments?
I will pursue option A, but I have yet to schedule time to do so. Is there a preferred name for such a function? osl_no_resolve_path doesn't have a nice ring to it. Additionally, I noticed that the comments tell me: "In rtl/uri there is already an URI parser etc. so this code should be consolidated." Are there any plans to do this?
What about "osl_getFileUrl"? At least it is simple :-) The comment in "sal/osl/unx/file_url.cxx" is likely from Stephan Bergmann; please contact him for further information. Michael, I am reassigning this issue to you but will stay on CC. If I understood correctly, you volunteered to change the code.
Any news?
One data point from our build, where this was initially enabled:

# Don't stat /home a zillion times -- needs some love see iz comments
# noelp disabling this because it causes funny problem where
# File::getAbsoluteFileURL returns incorrect
# result ( which in turn causes java bootstraping problems for regcomp )

So it causes at least some problems.
IMHO the problem is not that osl_getAbsoluteFileURL resolves all path components. The real question is: why is osl_getAbsoluteFileURL called so often during startup? If solution -A- is chosen, I bet that most of the occurrences that would have to be replaced by a newly created function (for example osl_getFileURLWithoutEllipse) do not need to call such a function at all. Often it is not the implementation of a single function that causes performance problems, but that an expensive function is used without need. The same applies to osl_getDirectoryItem, which is quite expensive. I have often removed code that used the OSL file API to list the contents of a directory with osl_openDirectory and osl_getNextDirectoryItem, stored just the URL of each file, and then called osl_getDirectoryItem for each URL.
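The directory-listing pattern described above has a rough POSIX analog, sketched here under stated assumptions: this is not OSL code, and listRegularFiles is a hypothetical name. Instead of storing each entry's full path and looking it up again from the root afterwards, the sketch queries each entry while the directory is still open, relative to the directory's file descriptor.

```cpp
#include <cassert>
#include <dirent.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <string>
#include <vector>

// Hypothetical helper: returns the names of the regular files in dirPath.
std::vector<std::string> listRegularFiles(const std::string& dirPath)
{
    std::vector<std::string> names;
    DIR* dir = opendir(dirPath.c_str());
    if (dir == nullptr)
        return names;  // unreadable or missing directory: report nothing

    const int dirFd = dirfd(dir);
    assert(dirFd >= 0);  // expected to hold for a stream from opendir()

    while (dirent* entry = readdir(dir))
    {
        struct stat info;
        // fstatat() looks the name up relative to the open directory, so
        // the parent components of dirPath are not walked again per entry.
        if (fstatat(dirFd, entry->d_name, &info, AT_SYMLINK_NOFOLLOW) == 0
            && S_ISREG(info.st_mode))
        {
            names.push_back(entry->d_name);
        }
    }
    closedir(dir);
    return names;
}
```

The expensive variant would collect dirPath + "/" + name for every entry first and then call lstat on each full path in a second pass, which repeats the lookup of every parent directory once per file.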
Reset assignee on issues not touched by assignee in more than 2000 days.