• Re: Tutorial: Notepad++ shortcuts.xml macro converts unicode to the 95-

    From Herbert Kleebauer@3:633/10 to All on Wed Dec 31 13:21:16 2025
    Subject: Re: Tutorial: Notepad++ shortcuts.xml macro converts unicode to the 95-keyboard ASCII characters

    On 12/31/2025 9:33 AM, Marian wrote:

    This line has a sneaky Unicode dash ? right here.
    This line has curly quotes ?like these?.
    This line has a non-breaking space between words.

    In Thunderbird this didn't arrive as valid UTF-8.

    "dash ? right" in hex:

    64 61 73 68 │ 20 FB 20 72 │ 69 67 68 74

    FB is not a valid lead byte in UTF-8 (its bit pattern, 111110xx, would
    start a 5-byte sequence in the obsolete pre-RFC 3629 scheme), and no
    continuation bytes follow it.
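
    The hex dump can be checked mechanically. Below is a minimal Python
    sketch; the function names are illustrative and not from the thread.
    It classifies a lead byte by the RFC 3629 ranges (4-byte sequences
    start with F0 through F4, so FB starts nothing valid) and reports the
    offset of the first byte that a strict UTF-8 decoder rejects.

```python
def utf8_lead_byte_length(byte: int):
    """Return the sequence length a lead byte announces under RFC 3629,
    or None if the byte cannot start a UTF-8 sequence."""
    if byte <= 0x7F:
        return 1
    if 0xC2 <= byte <= 0xDF:
        return 2
    if 0xE0 <= byte <= 0xEF:
        return 3
    if 0xF0 <= byte <= 0xF4:
        return 4
    # Continuation bytes 0x80-0xBF, plus 0xC0, 0xC1 and 0xF5-0xFF.
    return None


def first_invalid_offset(data: bytes):
    """Return the offset of the first byte a strict decoder rejects,
    or None if the data is valid UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        return err.start
    return None


# The bytes from the dump above ("dash ? right"), separators removed.
sample = bytes.fromhex("64 61 73 68 20 FB 20 72 69 67 68 74")
print(first_invalid_offset(sample))  # 5, the position of FB
```

    Running the same check on the raw article body, rather than on what
    the newsreader displays, is one way to tell whether the damage was
    done before or after posting.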


    type unicode2ascii.bat
    @echo off
    :: unicode2ascii.bat
    :: This batch file runs a PowerShell script that removes all non-ASCII
    :: characters from unicode.txt and writes the cleaned output to ascii.txt.
    powershell -NoProfile -ExecutionPolicy Bypass -File unicode2ascii.ps1
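
    The PowerShell script itself is not shown in the thread. As a rough
    illustration of what the batch file's comment describes, here is a
    Python sketch that drops every non-ASCII character. The file names
    unicode.txt and ascii.txt come from the comment above; the function
    names and the assumption that the input is UTF-8 are mine.

```python
from pathlib import Path


def strip_non_ascii(text: str) -> str:
    """Keep only characters in the 7-bit ASCII range (code points 0-127)."""
    return "".join(ch for ch in text if ord(ch) < 128)


def convert_file(src: str = "unicode.txt", dst: str = "ascii.txt") -> None:
    """Read src as UTF-8 (an assumption) and write the ASCII-only text to dst."""
    text = Path(src).read_text(encoding="utf-8")
    Path(dst).write_text(strip_non_ascii(text), encoding="ascii")
```

    Note that deleting characters is lossy: a dash between two words
    simply vanishes and leaves two spaces behind, rather than becoming
    an ASCII hyphen.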

    Wouldn't it be simpler to open the file in Notepad and save it with
    ANSI encoding instead of UTF-8?
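
    One caveat to that suggestion, sketched in Python: "ANSI" is not the
    same as 7-bit ASCII. The sketch assumes the ANSI code page is
    Windows-1252 (it depends on the system locale) and does not claim to
    reproduce what Notepad does with unmappable characters; it only
    shows how the code page itself maps them.

```python
def to_ansi(text: str) -> bytes:
    """Encode text as Windows-1252, replacing unmappable characters with '?'."""
    return text.encode("cp1252", errors="replace")


def is_seven_bit(data: bytes) -> bool:
    """True if every byte is in the 7-bit ASCII range."""
    return all(b < 0x80 for b in data)


dash = to_ansi("dash \u2013 right")   # en dash exists in cp1252 as byte 0x96
zwsp = to_ansi("zero\u200bwidth")     # zero-width space has no cp1252 mapping

print(dash)                # b'dash \x96 right'
print(is_seven_bit(dash))  # False
print(zwsp)                # b'zero?width'
```

    So an ANSI save keeps the dash and the curly quotes as 8-bit bytes
    and only degrades the characters the code page lacks.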

    --- PyGate Linux v1.5.2
    * Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)
  • From Herbert Kleebauer@3:633/10 to All on Wed Dec 31 20:47:38 2025
    Subject: Re: Tutorial: Notepad++ shortcuts.xml macro converts unicode to the 95-keyboard ASCII characters

    On 12/31/2025 7:21 PM, Marian wrote:

    Bear in mind there is much more than just Unicode characters in
    pasted web-page text as Unicode is only the container; the real trouble
    comes from the variety of characters inside it such as zero-width
    spaces & joiners, directional control characters, soft hyphens, etc.
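
    The characters listed here (zero-width spaces and joiners,
    directional marks, soft hyphens) all fall in Unicode general
    category Cf, "format", so they can be located without a hand-made
    list. A minimal Python sketch; the function name is illustrative:

```python
import unicodedata


def find_format_chars(text: str):
    """Return (index, code point, name) for every category-Cf character."""
    return [
        (i, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for i, ch in enumerate(text)
        if unicodedata.category(ch) == "Cf"
    ]


sample = "soft\u00adhyphen zero\u200bwidth"
for hit in find_format_chars(sample):
    print(hit)
# (4, 'U+00AD', 'SOFT HYPHEN')
# (16, 'U+200B', 'ZERO WIDTH SPACE')
```

    The non-breaking space is not caught by this test: U+00A0 is a
    space separator (category Zs), not a format character.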

    I don't understand the problem. These days, (nearly) every web page
    and Usenet posting uses UTF-8 character encoding. Your posting
    uses UTF-8 as well:

    Content-Type: text/plain; charset=UTF-8; format=flowed
    User-Agent: tin/1.6.2-20030910 ("Pabbay") (UNIX) (CYGWIN_NT-10.0-WOW/2.8.0(0.309/5/3) (i686)) Hamster/2.0.2.2
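
    A header only declares an encoding; it does not guarantee that the
    body bytes match it. As a sketch, Python's standard email package
    can pull the declared values out of an article. The function name
    and the one-line sample article are made up for illustration.

```python
from email import message_from_string


def declared_encoding(raw_article: str):
    """Return (content type, charset, format parameter) as declared in the headers."""
    msg = message_from_string(raw_article)
    return msg.get_content_type(), msg.get_content_charset(), msg.get_param("format")


raw = "Content-Type: text/plain; charset=UTF-8; format=flowed\n\nbody text\n"
print(declared_encoding(raw))  # ('text/plain', 'utf-8', 'flowed')
```

    Comparing the declared charset with a strict decode of the raw body
    shows whether a posting actually honours its own header.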


    There shouldn't be any problem when you copy text from a web page and
    paste it into a Usenet posting. There is only a problem if you
    convert the UTF-8 text into "something else" and then paste it
    into the posting. You don't solve a problem, you create a problem.


    Just post a link to a web page whose text causes a problem when
    copied and pasted into a Usenet posting.





  • From Chris@3:633/10 to All on Fri Jan 2 11:44:02 2026
    Subject: Re: Tutorial: Notepad++ shortcuts.xml macro converts unicode to the 95-keyboard ASCII characters

    Herbert Kleebauer <klee@unibwm.de> wrote:
    On 12/31/2025 7:21 PM, Marian wrote:

    Bear in mind there is much more than just Unicode characters in
    pasted web-page text as Unicode is only the container; the real trouble
    comes from the variety of characters inside it such as zero-width
    spaces & joiners, directional control characters, soft hyphens, etc.

    I don't understand the problem. These days, (nearly) every web page
    and Usenet posting uses UTF-8 character encoding. Your posting
    uses UTF-8 as well:

    Content-Type: text/plain; charset=UTF-8; format=flowed
    User-Agent: tin/1.6.2-20030910 ("Pabbay") (UNIX) (CYGWIN_NT-10.0-WOW/2.8.0(0.309/5/3) (i686)) Hamster/2.0.2.2


    There shouldn't be any problem when you copy text from a web page and
    paste it into a Usenet posting. There is only a problem if you
    convert the UTF-8 text into "something else" and then paste it
    into the posting. You don't solve a problem, you create a problem.

    100% accurate. This is Donald's MO.



  • From Frank Slootweg@3:633/10 to All on Sat Jan 3 13:07:46 2026
    Subject: Re: Tutorial: Notepad++ shortcuts.xml macro converts unicode to the 95-keyboard ASCII characters

    Herbert Kleebauer <klee@unibwm.de> wrote:
    On 12/31/2025 7:21 PM, Marian wrote:

    Bear in mind there is much more than just Unicode characters in
    pasted web-page text as Unicode is only the container; the real trouble
    comes from the variety of characters inside it such as zero-width
    spaces & joiners, directional control characters, soft hyphens, etc.

    I don't understand the problem. These days, (nearly) every web page
    and Usenet posting uses UTF-8 character encoding. Your posting
    uses UTF-8 as well:

    Content-Type: text/plain; charset=UTF-8; format=flowed
    User-Agent: tin/1.6.2-20030910 ("Pabbay") (UNIX) (CYGWIN_NT-10.0-WOW/2.8.0(0.309/5/3) (i686)) Hamster/2.0.2.2

    I don't know whether or not he used UTF-8, but, as with all his
    postings, these headers are fabricated/bogus.

    For example the 'User-Agent:' header is stolen from my postings
    (extremely unlikely that anyone else has that exact combination of
    newsreader/local server). And I normally don't use any 'Content-Type:'
    header and if I do, it's not UTF-8, nor format=flowed, so that header
    is fabricated/bogus as well.

    He claims you're partly responsible for him generating this crap :-) :

    Message-ID: <10j7auc$mji$1@nnrp.usenet.blueworldhosting.com>

    There shouldn't be any problem when you copy text from a web page and
    paste it into a Usenet posting. There is only a problem if you
    convert the UTF-8 text into "something else" and then paste it
    into the posting. You don't solve a problem, you create a problem.

    Creating problems is his purpose in life.

    Just post a link to a web page whose text causes a problem when
    copied and pasted into a Usenet posting.

    Don't hold your breath.
