Apache OpenOffice (AOO) Bugzilla – Issue 121585
Replace stlport hash_map with unordered_map
Last modified: 2013-08-07 15:34:48 UTC
Created attachment 80100 [details] [WIP] Shell script to do automated hash --> unordered_map replacements stlport is starting to interfere with boost which is bothersome when we have to update either of the packages but also when one starts to make more use of boost in the system. In general STLport has supports some older SGI functions (vectors, hash) that are deprecated in modern standards and that have equivalents in Boost. Looking at the changes required to get rid of STLport, the most extensive change is replacing the older non-std hashmap with the equivalent unordered_map. The changes are mostly automatic so I made a rough script to make those replacements. This still needs (a lot) more work but it should be a start.
My first instinct was to try the patch in the modules that already depend on boost. According to opengrok: xmlreader o3tl configmgr cppuhelper vcl formula cppu oox sal dbaccess desktop sc It currently breaks cppu though.
Created attachment 80101 [details] Initial patch (breaks in cppu)
Created attachment 80109 [details] Shell script to do automated hash --> unordered_map replacements New revision of the script. This one covers some stuff that was missing to replace in cppu. Additionally I just found that new versions of stlport integrate with boost: From http://stlport.sourceforge.net/ ___ 2005-11-03 STLport 5.0.0 Release * Use of boost type_traits rather than internal equivalent when requested (see _STLP_USE_BOOST_SUPPORT in config file); * unordered_set, unordered_multiset, unordered_map, unordered_multimap hash containers implementation specified in the TR1 document; 2006-03-01 STLport 5.0.2 Release * add --with-boost option for configure; _____ Given that boost 5.1.7 works on all supported platforms, it would be a good candidate for an update.
On second thought, it is probably good to look at alternative hash implementations. Just because we have the one from boost already in the tree doesn't mean it's the best. Google Sparsehash (under a BSD license) looks like an excellent option and it shouldn't be difficult to modify the script to use that instead: http://code.google.com/p/sparsehash/
mmh, replacing stlport with boost is not the option that comes into my mind first. I think it makes more sense to switch to the compiler stl. And the Google one would mean a new external dependency.
(In reply to comment #5) > mmh, replacing stlport with boost is not the option that comes into my mind > first. I think it makes more sense to switch to the compiler stl. And the > Google one would mean a new external dependency. std::unordered_map is a C++11 extension. So if using the compiler implementation, another container has to be chosen. I guess Pedro is trying to follow LO, the replaced stlport containers with boost's unordered ones: http://nabble.documentfoundation.org/LibreOffice-is-free-of-STLport-td2464453.html
ok I see, it's of course not a bad idea to make it like our friends, it will simply the merge of our code in theirs ;-)
Nice link Ariel: I asked "TrainedMonkey" if we could use his patches and he replied NO, but as my script shows the replacements are mostly automatic. The main reason to get rid of stlport is that it doesn't build with clang (therefore no modern MacOS X port), but I am also getting linking/STL conflicts every time I want to use one of the awesome math functions from Boost and it just became impossible to advance further. I found this nice benchmark: http://incise.org/hash-table-benchmarks.html My recommendation is to use boost::unordered_map for now, and eventually bring google sparsehash gradually as it lets us choose between memory consumption and performance. I won't go through this change though, I would prefer if someone with a major supported platform takes over.
OK, I learned something else from boost: http://www.boost.org/doc/libs/1_48_0/doc/html/boost_tr1/usage.html#boost_tr1.usage.native Ww either add #include <boost/tr1/unordered_set.hpp> or (more portable) #include <unordered_set> but if we do the last one we have to include boost-install-path/boost/tr1/tr1 to the compiler include path. I think the second option is better because it makes it easier to change the implementation if we later decide to drop boost and use a native implementation. It also sort of fits the way STLport is being used now. So my script has to be changed to use the second form. This would also make it easier to use other boost components.
Created attachment 80221 [details] patches and script for automated hash_* to boost::unordered conversion The patch contains a modified version of Pedro's automated container conversion from pre-standardized STL's hash_map, hash_set, etc. to their boost::unordered counterparts. It already works well enough to build AOO with GCC in C++11 mode and the application then works. Much more testing is required though not only for the application itself but also when building under different environments. To test the patch please: - checkout a clean trunk (e.g. as of today 2013/02/07) - apply the patch - cd main - source tmp_unordered.sh ... then follow the standard build instructions AFAIK all compilers for our platforms already support what we'd need from C++11, I suggest to switch to each compilers native STL. Or make it configurable to allow e.g. libstdc++/libc++/boost/stlport5/msvc_tr1 containers, etc.
Created attachment 80222 [details] patches and script for automated hash_* to boost::unordered conversion w/o user-defined literal fixups The same patch as above but without the changes that address problems regarding user-defined literals. These whitespace-only changes are now already in trunk.
(In reply to comment #11) > Created attachment 80222 [details] > patches and script for automated hash_* to boost::unordered conversion w/o > user-defined literal fixups > > The same patch as above but without the changes that address problems > regarding user-defined literals. These whitespace-only changes are now > already in trunk. Hi Herbert; First of all, GREAT job! You went ahead and dropped stlport. It looks like you left out the script from the latest patch? Solaris and FreeBSD will be affected. FreeBSD will need the same changes you did in main/solenv/gbuild/platform/linux.mk
One suggestion: As per comment 9 (just scroll up and look at it): It would be better to just add the Boost TR1 path to the build. This would simplify the shell script removing the boost/ references there. TR1 is an an installable addon in MS-Windows and is not available in FreeBSD's libstdc++ so it makes sense to just use the boost TR1 everywhere. We already use it in scaddins for complex number support.
Yes, the tmp_unordered.sh script was unfortunately skipped in the latest patch. It is the same one as in the patches predecessor. Using plain <unordered_*> includes instead of their "boosted" counterparts is preferable indeed. The same applies to direct uses of the boost namespace. If we want flexibility regarding the STL choice I suggest to extend the forwarding available from main/stlport/systemstl* to compiler provided tr1 stuff or to boost/libc++/sparsehash etc. In the first step I just wanted to make sure that we are really using the boost container headers instead of some other stuff loitering around. The build recipes for non-Linux platforms weren't updated yet. Once the tmp_unordered.sh script has settled we should open an extra branch and use it to bring in the other platforms one after the other.