• Returning small strings as std::array

    From Marcel Mueller@3:633/280.2 to All on Sat Mar 9 21:09:12 2024
    Is it reasonable to return small strings as std::array to avoid copying?

    std::string might require allocation if it has no small string
    optimization built in. Furthermore it cannot be initialized from old
    C-style APIs that require char* buffer and size_t buffer_size.

    The idea is to return std::array<char,10> or something like that. This
    causes no allocation and the compiler should be able to optimize the
    return value to emplace the result into the caller's storage.

    Any other idea?


    Marcel

    --- MBSE BBS v1.0.8.4 (Linux-x86_64)
    * Origin: MB-NET.NET for Open-News-Network e.V. (3:633/280.2@fidonet)
  • From Bonita Montero@3:633/280.2 to All on Sun Mar 10 02:23:07 2024
    On 09.03.2024 at 11:09, Marcel Mueller wrote:

    > Is it reasonable to return small strings as std::array to avoid copying?
    > std::string might require allocation if it has no small string
    > optimization built in. Furthermore it cannot be initialized from old
    > C-style APIs that require char* buffer and size_t buffer_size.
    > The idea is to return std::array<char,10> or something like that. This
    > causes no allocation and the compiler should be able to optimize the
    > return value to emplace the result into the caller's storage.
    > Any other idea?
    > Marcel

    C++ strings can store small strings internally. For MSVC and
    g++ / libstdc++ the internal buffer holds 16 bytes, i.e. at most
    15 characters plus the terminating zero; clang / libc++ stores up
    to 22 characters on 64-bit targets.
    The following program prints the smallest length that is no longer
    stored internally by your implementation:

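    #include <algorithm>
    #include <cstddef>
    #include <iostream>
    #include <iterator>
    #include <string>
    #include <utility>

    using namespace std;

    int main()
    {
        size_t length = 1;
        string from, to;
        char const *src, *dst;
        do
        {
            // rebuild "from" with the current length, then advance the length
            from.resize( 0 );
            fill_n( back_inserter( from ), length++, '*' );
            src = from.data();
            to = std::move( from );   // hands over the buffer only if it is external
            dst = to.data();
        } while( src != dst );        // equal addresses: the buffer was handed over
        cout << --length << endl;     // first length that is stored externally
    }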

    Iterators to basic_string objects are allowed to change when you move
    the string (other containers don't allow this), and so are the addresses
    of the characters. I simply construct an increasingly large string and
    check whether the starting address changes on moving. If it stays the
    same you've got an externally stored string, and the size which can be
    stored internally is one less.


  • From Paavo Helde@3:633/280.2 to All on Mon Mar 11 01:35:57 2024
    On 09.03.2024 at 12:09, Marcel Mueller wrote:
    > Is it reasonable to return small strings as std::array to avoid copying?

    No, I believe it's not reasonable as it would complicate the code and
    make it less readable, probably without any measurable benefit whatsoever.

    > std::string might require allocation if it has no small string
    > optimization built in.

    All mainstream C++ implementations are using small string optimization nowadays.

    > Furthermore it cannot be initialized from old
    > C-style APIs that require char* buffer and size_t buffer_size.

    Cannot really understand this statement. Are you worrying about the
    overhead of initializing 10 bytes in a std::string before calling a C
    style API? Can you actually measure this overhead?


    > The idea is to return std::array<char,10> or something like that. This
    > causes no allocation and the compiler should be able to optimize the
    > return value to emplace the result into the caller's storage.

    Seems like a perfect example of premature optimization. If your
    application really needs to win these hypothetical nanoseconds, then
    you should probably write your own string class tuned for maximum
    performance in that particular application.



  • From Marcel Mueller@3:633/280.2 to All on Mon Mar 11 03:17:35 2024
    On 10.03.24 at 15:35, Paavo Helde wrote:
    >> std::string might require allocation if it has no small string
    >> optimization built in.
    >
    > All mainstream C++ implementations are using small string optimization nowadays.

    Yes, it seems so.
    Several years ago it was not implemented on my platform.

    >> Furthermore it cannot be initialized from old C-style APIs that
    >> require char* buffer and size_t buffer_size.
    >
    > Cannot really understand this statement. Are you worrying about the
    > overhead of initializing 10 bytes in a std::string before calling a C
    > style API? Can you actually measure this overhead?

    The other way around. The C API cannot write to a std::string, because a
    std::string, once created with sufficient size, can no longer be converted
    to char*. So I always need a temporary char array to create the std::string.
    AFAIR using &str.front() to write the string is not allowed.


    Marcel

  • From Paavo Helde@3:633/280.2 to All on Mon Mar 11 03:43:01 2024
    On 10.03.2024 at 18:17, Marcel Mueller wrote:

    >>> Furthermore it cannot be initialized from old C-style APIs that
    >>> require char* buffer and size_t buffer_size.
    >>
    >> Cannot really understand this statement. Are you worrying about the
    >> overhead of initializing 10 bytes in a std::string before calling a C
    >> style API? Can you actually measure this overhead?
    >
    > The other way around. The C API cannot write to a std::string, because a
    > std::string, once created with sufficient size, can no longer be converted
    > to char*. So I always need a temporary char array to create the std::string.
    > AFAIR using &str.front() to write the string is not allowed.

    Your information has formally been out of date for 13 years, and for
    longer than that in practice. The C++11 standard added a guarantee that
    the string's internal buffer is contiguous, so you can write freely into
    the string via &str[0]; C++17 also added a non-const overload of data(),
    so there you can use str.data() as well. One can even write the
    terminating zero at str.data()[str.length()] (but no other value),
    meaning one can even use strcpy() for writing.


  • From Andrey Tarasevich@3:633/280.2 to All on Tue Mar 12 16:57:41 2024
    On 03/09/24 2:09 AM, Marcel Mueller wrote:
    > std::string might require allocation if it has no small string
    > optimization built in.

    True. SSO is not required, but virtually all competent implementations
    do it.

    > Furthermore it cannot be initialized from old
    > C-style APIs that require char* buffer and size_t buffer_size.

    That is not true.

    A non-const version of `std::string::data()` was added in C++17
    specifically to support this usage model. (But even before that you
    could gain non-const access to its internal buffer through `&str[0]`).

    You can pre-`resize()` a `std::string`, pass its `data()` (and `size()`)
    to a C function, determine the resulting length from the position of the
    zero terminator, and then `resize()` it to that length.

    Done.

    > The idea is to return std::array<char,10> or something like that. This
    > causes no allocation and the compiler should be able to optimize the
    > return value to emplace the result into the caller's storage.

    > Is it reasonable to return small strings as std::array to avoid copying?

    If you are sure that they will always fit into the size pre-determined
    at compile time, then perhaps it might be reasonable. Basically, the
    main benefit I can see here is that it allows one to extend SSO-like
    behavior beyond the boundary used by the existing `std::string`
    implementation, i.e. to avoid involving dynamic memory for a greater
    range of string lengths.

    --
    Best regards,
    Andrey


  • From Marcel Mueller@3:633/280.2 to All on Fri Mar 15 06:42:54 2024
    On 12.03.24 at 06:57, Andrey Tarasevich wrote:
    > On 03/09/24 2:09 AM, Marcel Mueller wrote:
    >> Furthermore it cannot be initialized from old C-style APIs that
    >> require char* buffer and size_t buffer_size.
    >
    > That is not true.
    >
    > A non-const version of `std::string::data()` was added in C++17
    > specifically to support this usage model. (But even before that you
    > could gain non-const access to its internal buffer through `&str[0]`).

    Thanks, I was not aware of the latter!

    The application is restricted to C++11. (I forgot to mention.)


    >> Is it reasonable to return small strings as std::array to avoid copying?
    >
    > If you are sure that they will always fit into the pre-determined (at
    > the compile time) size, then perhaps it might be reasonable.

    Yes, if the C-API has a size parameter, I am quite sure. ;-)


    Marcel

