• unordered_map vs. map

    From Bonita Montero@3:633/280.2 to All on Sat Mar 2 02:03:01 2024
    I had the idea that filling an unordered_map, copying its contents to
    a vector and sorting that afterwards might be faster than constructing
    a similar map<>, which is sorted by nature. So I wrote some code for
    that:

    #include <iostream>
    #include <chrono>
    #include <unordered_map>
    #include <map>
    #include <algorithm>
    #include <vector>

    using namespace std;
    using namespace chrono;

    int main()
    {
        auto tm = []( char const *head, auto fn )
        {
            auto start = high_resolution_clock::now();
            fn();
            cout << head << duration_cast<nanoseconds>( high_resolution_clock::now() - start ).count() / 1.0e9 << endl;
        };
        tm( "unordered->sort: ",
            [&]()
            {
                unordered_map<unsigned, unsigned> unsorted;
                for( unsigned i = 0; i != 1'000'000; ++i )
                    unsorted.emplace( i, i );
                vector<pair<unsigned, unsigned>> sorted;
                sorted.reserve( unsorted.size() );
                copy( unsorted.begin(), unsorted.end(), back_inserter( sorted ) );
                sort( sorted.begin(), sorted.end(), []( auto &lhs, auto &rhs ) {
                    return lhs.first < rhs.first; } );
            } );
        tm( "ordered: ",
            [&]()
            {
                map<unsigned, unsigned> sorted;
                for( unsigned i = 0; i != 1'000'000; ++i )
                    sorted.emplace( i, i );
            } );
    }

    With g++ 12, filling an unordered_map and sorting afterwards is about
    twice as fast:

    unordered->sort: 0.0476365
    ordered: 0.0908241

    With MSVC the unordered version is much slower:

    unordered->sort: 0.204837
    ordered: 0.0862239

    I think with MSVC the memory allocator must be much more conservative
    than libstdc++'s thread-caching allocator.
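
    One way to gauge how much the allocator is involved: count the raw
    allocations each container performs. The following is a hypothetical
    sketch (counting_allocator and the element count of 100'000 are made
    up for illustration, not part of the benchmark above); both containers
    allocate roughly one node per element, and the unordered_map
    additionally allocates its growing bucket arrays, so allocator
    throughput dominates the insert loop.

    #include <cstddef>
    #include <iostream>
    #include <map>
    #include <new>
    #include <unordered_map>

    // Counts every allocation made through it; a global counter keeps the
    // sketch short (single-threaded use only).
    static std::size_t g_allocs = 0;

    template<class T>
    struct counting_allocator
    {
        using value_type = T;
        counting_allocator() = default;
        template<class U>
        counting_allocator( counting_allocator<U> const & ) noexcept {}
        T *allocate( std::size_t n )
        {
            ++g_allocs;
            return static_cast<T *>( ::operator new( n * sizeof(T) ) );
        }
        void deallocate( T *p, std::size_t ) noexcept { ::operator delete( p ); }
    };
    template<class T, class U>
    bool operator==( counting_allocator<T> const &, counting_allocator<U> const & ) noexcept
    { return true; }
    template<class T, class U>
    bool operator!=( counting_allocator<T> const &, counting_allocator<U> const & ) noexcept
    { return false; }

    int main()
    {
        using pr = std::pair<const unsigned, unsigned>;
        constexpr unsigned n = 100'000;

        std::map<unsigned, unsigned, std::less<unsigned>,
                 counting_allocator<pr>> ordered;
        for( unsigned i = 0; i != n; ++i )
            ordered.emplace( i, i );
        std::cout << "map allocations:           " << g_allocs << '\n';

        g_allocs = 0;
        std::unordered_map<unsigned, unsigned, std::hash<unsigned>,
                           std::equal_to<unsigned>,
                           counting_allocator<pr>> unordered;
        for( unsigned i = 0; i != n; ++i )
            unordered.emplace( i, i );
        std::cout << "unordered_map allocations: " << g_allocs << '\n';
    }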

  • From Bonita Montero@3:633/280.2 to All on Sat Mar 2 06:44:40 2024
    On 01.03.2024 at 16:03, Bonita Montero wrote:
    I had the idea that filling an unordered_map, copying its contents to
    a vector and sorting that afterwards might be faster than constructing
    a similar map<>, which is sorted by nature. So I wrote some code for
    that:

    #include <iostream>
    #include <chrono>
    #include <unordered_map>
    #include <map>
    #include <algorithm>
    #include <vector>

    using namespace std;
    using namespace chrono;

    int main()
    {
        auto tm = []( char const *head, auto fn )
        {
            auto start = high_resolution_clock::now();
            fn();
            cout << head << duration_cast<nanoseconds>( high_resolution_clock::now() - start ).count() / 1.0e9 << endl;
        };
        tm( "unordered->sort: ",
            [&]()
            {
                unordered_map<unsigned, unsigned> unsorted;
                unsorted.reserve( 1'000'000 );
                for( unsigned i = 0; i != 1'000'000; ++i )
                    unsorted.emplace( i, i );
                vector<pair<unsigned, unsigned>> sorted;
                sorted.reserve( unsorted.size() );
                copy( unsorted.begin(), unsorted.end(), back_inserter( sorted ) );
                sort( sorted.begin(), sorted.end(), []( auto &lhs, auto &rhs ) {
                    return lhs.first < rhs.first; } );
            } );
        tm( "ordered: ",
            [&]()
            {
                map<unsigned, unsigned> sorted;
                for( unsigned i = 0; i != 1'000'000; ++i )
                    sorted.emplace( i, i );
            } );
    }

    With g++ 12, filling an unordered_map and sorting afterwards is about
    twice as fast:

    unordered->sort: 0.0476365
    ordered: 0.0908241

    With MSVC the unordered version is much slower:

    unordered->sort: 0.204837
    ordered: 0.0862239

    I think with MSVC the memory allocator must be much more conservative
    than libstdc++'s thread-caching allocator.

    If I reserve a number of buckets according to the number of inserted
    nodes in the unordered_map, I get a +46% speedup with g++ and +110% on
    Windows. I never thought that rehashing the buckets could be that
    expensive, since the number of rehashes is only logarithmic in the
    number of inserted nodes.
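
    For illustration, a minimal sketch of what the reserve() call avoids
    (the counter and the element count are only for demonstration):
    unordered_map::reserve(n) sizes the bucket array for n elements at the
    current max_load_factor up front, so the insert loop never rehashes,
    whereas the default-constructed map regrows its bucket array
    (rehashing every element already stored) each time the load factor
    limit would be exceeded.

    #include <cstddef>
    #include <iostream>
    #include <unordered_map>

    int main()
    {
        // Without reserve(): count how often the bucket array is replaced.
        std::unordered_map<unsigned, unsigned> grown;
        std::size_t rehashes = 0, buckets = grown.bucket_count();
        for( unsigned i = 0; i != 1'000'000; ++i )
        {
            grown.emplace( i, i );
            if( grown.bucket_count() != buckets )
            {
                ++rehashes;
                buckets = grown.bucket_count();
            }
        }
        std::cout << "rehashes without reserve: " << rehashes << '\n';

        // With reserve(): the final bucket count is allocated up front,
        // so the bucket count never changes during the insert loop.
        std::unordered_map<unsigned, unsigned> reserved;
        reserved.reserve( 1'000'000 );
        buckets = reserved.bucket_count();
        for( unsigned i = 0; i != 1'000'000; ++i )
            reserved.emplace( i, i );
        std::cout << "rehashes with reserve:    "
                  << ( reserved.bucket_count() != buckets ) << '\n';
    }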


  • From Andrey Tarasevich@3:633/280.2 to All on Sat Mar 2 13:45:02 2024
    On 03/01/24 7:03 AM, Bonita Montero wrote:
    I had the idea that filling an unordered_map, copying its contents to
    a vector and sorting that afterwards might be faster than constructing
    a similar map<>, which is sorted by nature. So I wrote some code for

    Um... This is common knowledge. In the general case, if you have all
    your data available in advance, it is always faster to use a good
    off-line sorting algorithm than an on-line one.

    The on-line nature of `std::map`-based sorting is a luxury for which
    you pay with extra overhead.
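
    To make the distinction concrete, a minimal sketch (the element count
    is only illustrative): the off-line variant collects everything first
    and sorts exactly once, while the on-line variant keeps its contents
    ordered after every single insertion, which is the extra work a
    std::map does whether you need the intermediate orderings or not.

    #include <algorithm>
    #include <map>
    #include <utility>
    #include <vector>

    int main()
    {
        // Off-line: all data is known in advance, so collect and sort once.
        std::vector<std::pair<unsigned, unsigned>> offline;
        offline.reserve( 1'000'000 );
        for( unsigned i = 0; i != 1'000'000; ++i )
            offline.emplace_back( i, i );
        std::sort( offline.begin(), offline.end() );    // one O(n log n) pass

        // On-line: the container is sorted after every insertion, paid for
        // with a node allocation and tree rebalancing per element.
        std::map<unsigned, unsigned> online;
        for( unsigned i = 0; i != 1'000'000; ++i )
            online.emplace( i, i );
    }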

    --
    Best regards,
    Andrey

  • From Bonita Montero@3:633/280.2 to All on Sat Mar 2 17:35:21 2024
    On 02.03.2024 at 03:45, Andrey Tarasevich wrote:
    On 03/01/24 7:03 AM, Bonita Montero wrote:
    I had the idea that filling an unordered_map, copying its contents to
    a vector and sorting that afterwards might be faster than constructing
    a similar map<>, which is sorted by nature. So I wrote some code for

    Um... This is common knowledge. ...

    That MSVC isn't faster with that?

    The on-line nature of `std::map`-based sorting is a luxury for which
    you pay with extra overhead.

    Walking the pointer chain in the tree is just slow because it is a
    serialized operation with no instruction-level parallelism: the
    address of the next node is only known once the previous load has
    completed.
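
    As a sketch of that dependency chain (a hypothetical minimal node
    type, not the layout of any particular std::map implementation): each
    load produces the address needed by the next one, so the loads,
    typically cache misses, cannot overlap.

    #include <iostream>

    // Hypothetical binary-tree node, just to show the dependent loads;
    // real std::map nodes also carry parent and color bookkeeping.
    struct node
    {
        unsigned key;
        node    *left;
        node    *right;
    };

    // Each iteration dereferences the pointer produced by the previous
    // one, so the CPU cannot even compute the next address before the
    // current load completes; there is no instruction-level parallelism.
    node *find( node *root, unsigned key )
    {
        node *cur = root;
        while( cur && cur->key != key )
            cur = key < cur->key ? cur->left : cur->right;
        return cur;
    }

    int main()
    {
        node leaves[2] = { { 1, nullptr, nullptr }, { 3, nullptr, nullptr } };
        node root      = { 2, &leaves[0], &leaves[1] };
        std::cout << ( find( &root, 3 ) != nullptr ) << '\n';   // prints 1
    }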
