• unordered_map vs. map

    From Bonita Montero@3:633/280.2 to All on Sat Mar 2 02:03:01 2024
    I had the idea that filling an unordered_map, copying its contents to
    a vector and sorting that afterwards might be faster than constructing
    a similar map<>, which is sorted by nature. So I wrote some code for
    that:

    #include <iostream>
    #include <chrono>
    #include <unordered_map>
    #include <map>
    #include <algorithm>
    #include <vector>

    using namespace std;
    using namespace chrono;

    int main()
    {
        auto tm = []( char const *head, auto fn )
        {
            auto start = high_resolution_clock::now();
            fn();
            cout << head << duration_cast<nanoseconds>( high_resolution_clock::now() - start ).count() / 1.0e9 << endl;
        };
        tm( "unordered->sort: ",
            [&]()
            {
                unordered_map<unsigned, unsigned> unsorted;
                for( unsigned i = 0; i != 1'000'000; ++i )
                    unsorted.emplace( i, i );
                vector<pair<unsigned, unsigned>> sorted;
                sorted.reserve( unsorted.size() );
                copy( unsorted.begin(), unsorted.end(), back_inserter( sorted ) );
                sort( sorted.begin(), sorted.end(), []( auto &lhs, auto &rhs ) {
                    return lhs.first < rhs.first; } );
            } );
        tm( "ordered: ",
            [&]()
            {
                map<unsigned, unsigned> sorted;
                for( unsigned i = 0; i != 1'000'000; ++i )
                    sorted.emplace( i, i );
            } );
    }

    With g++ 12, filling an unordered_map and sorting afterwards is about
    twice as fast:

    unordered->sort: 0.0476365
    ordered: 0.0908241

    With MSVC the unordered version is much slower:

    unordered->sort: 0.204837
    ordered: 0.0862239

    I think with MSVC the memory allocator must be much more conservative
    than libstdc++'s thread-caching allocator.
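
    One way to gauge how much the allocator is involved: count the raw
    allocations each container performs. The following is a hypothetical
    sketch (counting_allocator and the element count of 100'000 are made
    up for illustration, not part of the benchmark above); both containers
    allocate roughly one node per element, and the unordered_map
    additionally allocates its growing bucket arrays, so allocator
    throughput dominates the insert loop.

    #include <cstddef>
    #include <iostream>
    #include <map>
    #include <new>
    #include <unordered_map>

    // Counts every allocation made through it; a global counter keeps the
    // sketch short (single-threaded use only).
    static std::size_t g_allocs = 0;

    template<class T>
    struct counting_allocator
    {
        using value_type = T;
        counting_allocator() = default;
        template<class U>
        counting_allocator( counting_allocator<U> const & ) noexcept {}
        T *allocate( std::size_t n )
        {
            ++g_allocs;
            return static_cast<T *>( ::operator new( n * sizeof(T) ) );
        }
        void deallocate( T *p, std::size_t ) noexcept { ::operator delete( p ); }
    };
    template<class T, class U>
    bool operator==( counting_allocator<T> const &, counting_allocator<U> const & ) noexcept
    { return true; }
    template<class T, class U>
    bool operator!=( counting_allocator<T> const &, counting_allocator<U> const & ) noexcept
    { return false; }

    int main()
    {
        using pr = std::pair<const unsigned, unsigned>;
        constexpr unsigned n = 100'000;

        std::map<unsigned, unsigned, std::less<unsigned>,
                 counting_allocator<pr>> ordered;
        for( unsigned i = 0; i != n; ++i )
            ordered.emplace( i, i );
        std::cout << "map allocations:           " << g_allocs << '\n';

        g_allocs = 0;
        std::unordered_map<unsigned, unsigned, std::hash<unsigned>,
                           std::equal_to<unsigned>,
                           counting_allocator<pr>> unordered;
        for( unsigned i = 0; i != n; ++i )
            unordered.emplace( i, i );
        std::cout << "unordered_map allocations: " << g_allocs << '\n';
    }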

  • From Bonita Montero@3:633/280.2 to All on Sat Mar 2 06:44:40 2024
    On 01.03.2024 at 16:03, Bonita Montero wrote:
    I had the idea that filling an unordered_map, copying its contents to
    a vector and sorting that afterwards might be faster than constructing
    a similar map<>, which is sorted by nature. So I wrote some code for
    that:

    #include <iostream>
    #include <chrono>
    #include <unordered_map>
    #include <map>
    #include <algorithm>
    #include <vector>

    using namespace std;
    using namespace chrono;

    int main()
    {
        auto tm = []( char const *head, auto fn )
        {
            auto start = high_resolution_clock::now();
            fn();
            cout << head << duration_cast<nanoseconds>( high_resolution_clock::now() - start ).count() / 1.0e9 << endl;
        };
        tm( "unordered->sort: ",
            [&]()
            {
                unordered_map<unsigned, unsigned> unsorted;
                unsorted.reserve( 1'000'000 );
                for( unsigned i = 0; i != 1'000'000; ++i )
                    unsorted.emplace( i, i );
                vector<pair<unsigned, unsigned>> sorted;
                sorted.reserve( unsorted.size() );
                copy( unsorted.begin(), unsorted.end(), back_inserter( sorted ) );
                sort( sorted.begin(), sorted.end(), []( auto &lhs, auto &rhs ) {
                    return lhs.first < rhs.first; } );
            } );
        tm( "ordered: ",
            [&]()
            {
                map<unsigned, unsigned> sorted;
                for( unsigned i = 0; i != 1'000'000; ++i )
                    sorted.emplace( i, i );
            } );
    }

    With g++ 12, filling an unordered_map and sorting afterwards is about
    twice as fast:

    unordered->sort: 0.0476365
    ordered: 0.0908241

    With MSVC the unordered version is much slower:

    unordered->sort: 0.204837
    ordered: 0.0862239

    I think with MSVC the memory allocator must be much more conservative
    than libstdc++'s thread-caching allocator.

    If I reserve a number of buckets according to the number of inserted
    nodes in the unordered_map, I get a +46% speedup with g++ and +110% on
    Windows. I never thought that rehashing the buckets could be that
    expensive, since the number of rehashes is only logarithmic in the
    number of inserted nodes.
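
    For illustration, a minimal sketch of what the reserve() call avoids
    (the counter and the element count are only for demonstration):
    unordered_map::reserve(n) sizes the bucket array for n elements at the
    current max_load_factor up front, so the insert loop never rehashes,
    whereas the default-constructed map regrows its bucket array
    (rehashing every element already stored) each time the load factor
    limit would be exceeded.

    #include <cstddef>
    #include <iostream>
    #include <unordered_map>

    int main()
    {
        // Without reserve(): count how often the bucket array is replaced.
        std::unordered_map<unsigned, unsigned> grown;
        std::size_t rehashes = 0, buckets = grown.bucket_count();
        for( unsigned i = 0; i != 1'000'000; ++i )
        {
            grown.emplace( i, i );
            if( grown.bucket_count() != buckets )
            {
                ++rehashes;
                buckets = grown.bucket_count();
            }
        }
        std::cout << "rehashes without reserve: " << rehashes << '\n';

        // With reserve(): the final bucket count is allocated up front,
        // so the bucket count never changes during the insert loop.
        std::unordered_map<unsigned, unsigned> reserved;
        reserved.reserve( 1'000'000 );
        buckets = reserved.bucket_count();
        for( unsigned i = 0; i != 1'000'000; ++i )
            reserved.emplace( i, i );
        std::cout << "rehashes with reserve:    "
                  << ( reserved.bucket_count() != buckets ) << '\n';
    }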


  • From Andrey Tarasevich@3:633/280.2 to All on Sat Mar 2 13:45:02 2024
    On 03/01/24 7:03 AM, Bonita Montero wrote:
    I had the idea that filling an unordered_map, copying its contents to
    a vector and sorting that afterwards might be faster than constructing
    a similar map<>, which is sorted by nature. So I wrote some code for

    Um... This is common knowledge. In the general case, if you have all
    your data available in advance, it is always faster to use a good
    off-line sorting algorithm than an on-line one.

    The on-line nature of `std::map`-based sorting is a luxury for which
    you pay with extra overhead.
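
    To make the distinction concrete, a minimal sketch (the element count
    is only illustrative): the off-line variant collects everything first
    and sorts exactly once, while the on-line variant keeps its contents
    ordered after every single insertion, which is the extra work a
    std::map does whether you need the intermediate orderings or not.

    #include <algorithm>
    #include <map>
    #include <utility>
    #include <vector>

    int main()
    {
        // Off-line: all data is known in advance, so collect and sort once.
        std::vector<std::pair<unsigned, unsigned>> offline;
        offline.reserve( 1'000'000 );
        for( unsigned i = 0; i != 1'000'000; ++i )
            offline.emplace_back( i, i );
        std::sort( offline.begin(), offline.end() );    // one O(n log n) pass

        // On-line: the container is sorted after every insertion, paid for
        // with a node allocation and tree rebalancing per element.
        std::map<unsigned, unsigned> online;
        for( unsigned i = 0; i != 1'000'000; ++i )
            online.emplace( i, i );
    }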

    --
    Best regards,
    Andrey

  • From Bonita Montero@3:633/280.2 to All on Sat Mar 2 17:35:21 2024
    On 02.03.2024 at 03:45, Andrey Tarasevich wrote:
    On 03/01/24 7:03 AM, Bonita Montero wrote:
    I had the idea that filling an unordered_map, copying its contents to
    a vector and sorting that afterwards might be faster than constructing
    a similar map<>, which is sorted by nature. So I wrote some code for

    Um... This is common knowledge. ...

    That MSVC isn't faster with that?

    The on-line nature of `std::map`-based sorting is a luxury for which
    you pay with extra overhead.

    Walking the pointer chain in the tree is just slow because it is a
    serialized operation with no instruction-level parallelism: the
    address of the next node is only known once the previous load has
    completed.
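
    As a sketch of that dependency chain (a hypothetical minimal node
    type, not the layout of any particular std::map implementation): each
    load produces the address needed by the next one, so the loads,
    typically cache misses, cannot overlap.

    #include <iostream>

    // Hypothetical binary-tree node, just to show the dependent loads;
    // real std::map nodes also carry parent and color bookkeeping.
    struct node
    {
        unsigned key;
        node    *left;
        node    *right;
    };

    // Each iteration dereferences the pointer produced by the previous
    // one, so the CPU cannot even compute the next address before the
    // current load completes; there is no instruction-level parallelism.
    node *find( node *root, unsigned key )
    {
        node *cur = root;
        while( cur && cur->key != key )
            cur = key < cur->key ? cur->left : cur->right;
        return cur;
    }

    int main()
    {
        node leaves[2] = { { 1, nullptr, nullptr }, { 3, nullptr, nullptr } };
        node root      = { 2, &leaves[0], &leaves[1] };
        std::cout << ( find( &root, 3 ) != nullptr ) << '\n';   // prints 1
    }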
