"Michael F. Stemper" <michael.stemper@gmail.com> wrote or quoted:
if not re.search("\\sout\{", line):
So, if you're not down to slap an "r" before your string literals,
you're going to end up doubling down on every backslash.
Long story short, those double backslashes in your regex?
They'll be quadrupling up in your Python string literal!
for line in lines:
    product = re.search( "\\\\sout\\{", line )
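A quick way to convince yourself that the doubled-up literal and the r-prefixed one are the same pattern. This is only a minimal sketch; the sample `lines` list is made up for illustration and is not from the original post.

```python
import re

# Hypothetical sample input; only the second line contains the TeX markup.
lines = ["plain text", r"some \sout{struck} text"]

doubled = "\\\\sout\\{"   # ordinary literal: every regex backslash doubled again
raw = r"\\sout\{"         # raw literal: written exactly as the regex engine sees it

# Both literals produce the very same string object contents.
same_pattern = (doubled == raw)
print(same_pattern)

# Keep only the lines that do NOT contain \sout{
kept = [line for line in lines if not re.search(raw, line)]
print(kept)
```

Which spelling to use is then purely a readability question; the regex engine cannot tell them apart.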
"Michael F. Stemper" <michael.stemper@gmail.com> wrote or quoted:
For now, I'll use the "r" in a cargo-cult fashion, until I decide which
syntax I prefer. (Is there any reason that one or the other is preferable?)
I'd totally go with the r-style notation!
It's got one bummer though - you can't end such a string literal with
a single backslash (or any odd number of them). But hey, no biggie, you
could use one of these notations:
main.py
path = r'C:\Windows\example' + '\\'
print( path )
path = r'''
C:\Windows\example\
'''.strip()
print( path )
stdout
C:\Windows\example\
C:\Windows\example\
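The reason for that limitation, as far as I understand it, is that even in a raw literal a backslash still keeps the following quote from closing the string. A small sketch that checks this with the built-in compile() instead of crashing the script (the variable names are made up for the example):

```python
# The text being compiled is:  path = r'C:\Windows\example\'
source = "path = r'C:\\Windows\\example\\'"

try:
    compile(source, "<example>", "exec")
    trailing_backslash_ok = True
except SyntaxError:
    # The final \' does not close the literal, so the string never ends.
    trailing_backslash_ok = False
print(trailing_backslash_ok)

# The workaround from the post: append the backslash from a normal literal.
path = r'C:\Windows\example' + '\\'
print(path.endswith('\\'))
```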
"Michael F. Stemper" <michael.stemper@gmail.com> wrote or quoted:
path = r'C:\Windows\example' + '\\'
On Mon, Oct 07, 2024 at 08:35:32AM -0500, Michael F. Stemper via Python-list wrote:
I'm trying to discard lines that include the string "\sout{" (which is TeX, for
those who are curious. I have tried:
if not re.search("\sout{", line):
if not re.search("\sout\{", line):
if not re.search("\\sout{", line):
if not re.search("\\sout\{", line):
unwanted_tex = '\sout{'
if unwanted_tex not in line: do_something_with_libreoffice()
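Since the text being looked for is a fixed string, the plain `in` test avoids the regex layer entirely. A minimal sketch of that idea, written with a raw literal so there is no escape-sequence guessing; the sample `lines` are invented, and the filtering step stands in for whatever do_something_with_libreoffice() would do:

```python
unwanted_tex = r'\sout{'   # raw literal: backslash, s, o, u, t, left brace

# Hypothetical input lines for illustration only.
lines = [
    r"keep this line",
    r"drop \sout{this} line",
    r"keep \emph{this} one too",
]

# Substring test: no regular expression, so nothing needs regex escaping.
kept = [line for line in lines if unwanted_tex not in line]
print(kept)
```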
I'm trying to discard lines that include the string "\sout{" (which is TeX, for
those who are curious. I have tried:
if not re.search("\sout{", line):
if not re.search("\sout\{", line):
if not re.search("\\sout{", line):
if not re.search("\\sout\{", line):
But the lines with that string keep coming through. What is the right syntax to
properly escape the backslash and the left curly bracket?
However, regex also uses backslash as an escape character.
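Because both layers treat the backslash as an escape character, one way to avoid counting backslashes by hand is to let re.escape() build the pattern from the literal text. A minimal sketch; the sample `lines` are made up for illustration:

```python
import re

needle = r"\sout{"            # the literal text to look for
pattern = re.escape(needle)   # backslash-escapes the regex metacharacters
print(pattern)

# Hypothetical input lines.
lines = ["plain text", r"some \sout{struck} text"]

# Discard the lines that contain the literal text.
kept = [line for line in lines if not re.search(pattern, line)]
print(kept)
```

Printing `pattern` shows what the regex engine will actually receive, which is handy when you are unsure how many backslashes survived.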
"\\\\chardef \\\\\\\\ = '\\\\\\\\".
unwanted_tex = '\sout{'
if unwanted_tex not in line: do_something_with_libreoffice()
That should be:
unwanted_tex = r'\sout{'
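Both spellings end up as the same string, because Python keeps the backslash of an escape it does not recognize, such as \s; the non-raw literal just draws a warning when the source is compiled. A sketch that makes this visible from inside a script, assuming a Python version where the invalid escape is still only a warning (DeprecationWarning before 3.12, SyntaxWarning from 3.12 on; a later version may turn it into an error):

```python
import warnings

# The source text being compiled is:  tex = '\sout{'
source = "tex = '\\sout{'"

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")          # do not let filters hide it
    code = compile(source, "<example>", "exec")

namespace = {}
exec(code, namespace)

# Same value as the raw literal, but the compiler complained along the way.
print(namespace["tex"] == r"\sout{")
print([w.category.__name__ for w in caught])
```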
Karsten Hilbert <Karsten.Hilbert@gmx.net> writes:
Python 3.11.2 (main, Aug 26 2024, 07:20:54) [GCC 12.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> tex = '\sout{'
>>> tex
'\\sout{'
>>>
Am I missing something ?
You're missing the warning it generates:
> python -E -Wonce
Python 3.11.2 (main, Aug 26 2024, 07:20:54) [GCC 12.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> tex = '\sout{'
<stdin>:1: DeprecationWarning: invalid escape sequence '\s'
>>>
I'm trying to discard lines that include the string "\sout{" (which is
TeX, for those who are curious. I have tried:
if not re.search("\sout{", line):
if not re.search("\sout\{", line):
if not re.search("\\sout{", line):
if not re.search("\\sout\{", line):
But the lines with that string keep coming through. What is the right
syntax to properly escape the backslash and the left curly bracket?
>>> import re
>>> s = r"testing \sout{WHADDEVVA}"
>>> re.search(r"\\sout{", s)
<re.Match object; span=(8, 14), match='\\sout{'>
>>> re.search("\\\\sout{", s)
<re.Match object; span=(8, 14), match='\\sout{'>
Is there some utility function out there that can be called to show what the regular expression you typed in will look like by the time it is ready to be used?
Obviously, life is not that simple as it can go through multiple layers with each dealing with a layer of backslashes.
But for simple cases, ...
-----Original Message-----
From: Python-list <python-list-bounces+avi.e.gross=gmail.com@python.org> On Behalf Of Gilmeh Serda via Python-list
Sent: Friday, October 11, 2024 10:44 AM
To: python-list@python.org
Subject: Re: Correct syntax for pathological re.search()
On Mon, 7 Oct 2024 08:35:32 -0500, Michael F. Stemper wrote:
I'm trying to discard lines that include the string "\sout{" (which is
TeX, for those who are curious. I have tried:
if not re.search("\sout{", line):
if not re.search("\sout\{", line):
if not re.search("\\sout{", line):
if not re.search("\\sout\{", line):
But the lines with that string keep coming through. What is the right
syntax to properly escape the backslash and the left curly bracket?
$ python
Python 3.12.6 (main, Sep 8 2024, 13:18:56) [GCC 14.2.1 20240805] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import re
>>> s = r"testing \sout{WHADDEVVA}"
>>> re.search(r"\\sout{", s)
<re.Match object; span=(8, 14), match='\\sout{'>
You want a literal backslash, hence, you need to escape it at both levels.
It is not enough to escape the "\s" as "\\s", because that only takes care
of Python's demands for escaping "\". You also need to escape the "\" for
the RegEx as well, or it will read it like it means "\s", which is the
RegEx for any whitespace character, and therefore your search doesn't match,
because it reads it like you want to search for whitespace followed by "out{".
Therefore, you need to escape it either as per my example, or by using
four "\" and no "r" in front of the first quote, which also works:
>>> re.search("\\\\sout{", s)
<re.Match object; span=(8, 14), match='\\sout{'>
You don't need to escape the curly braces. We call them "seagull wings"
where I live.
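The two levels of escaping can be checked directly. In this sketch (variable names are mine, the test string is the one from the session above), the under-escaped pattern matches whitespace followed by out{ and misses the TeX text, while the fully escaped one finds it:

```python
import re

tex_line = r"testing \sout{WHADDEVVA}"

under_escaped = "\\sout{"     # regex engine receives \sout{  : whitespace, then out{
fully_escaped = "\\\\sout{"   # regex engine receives \\sout{ : a backslash, then sout{

miss = re.search(under_escaped, tex_line)        # no whitespace before "out{" here
space_hit = re.search(under_escaped, "a out{")   # a space satisfies \s
tab_hit = re.search(under_escaped, "a\tout{")    # so does a tab
hit = re.search(fully_escaped, tex_line)

print(miss)
print(space_hit is not None, tab_hit is not None)
print(hit.span())
```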
On 2024-10-11 17:13:07 -0400, AVI GROSS via Python-list wrote:
Is there some utility function out there that can be called to show what the
regular expression you typed in will look like by the time it is ready to be
used?
I assume that by "ready to be used" you mean the compiled form?
No, there doesn't seem to be a way to dump that. You can
p = re.compile("\\\\sout{")
print(p.pattern)
but that just prints the input string, which you could do without
compiling it first.
But - without having looked at the implementation - it's far from clear
that the compiled form would be useful to the user. It's probably some
kind of state machine, and a large table of state transitions isn't very readable.
There are a number of websites which visualize regular expressions.
Those are probably better for debugging a regular expression than
anything the re module could reasonably produce (although with the
caveat that such a web site would use a different implementation and therefore might produce different results).
hp
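For what it's worth, the standard library does have one relevant knob: passing the re.DEBUG flag to re.compile() prints debug information about the parsed pattern. The output format is an implementation detail and differs between Python versions, so treat this as a sketch for eyeballing a pattern rather than something to parse. Here the output is captured so it can be inspected:

```python
import contextlib
import io
import re

buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    # re.DEBUG makes the compile step print the parsed pattern to stdout.
    re.compile("\\\\sout{", re.DEBUG)

dump = buffer.getvalue()
print(dump)

# A literal backslash shows up as character code 92; if the pattern had been
# under-escaped, the first item would be a whitespace category instead.
starts_with_backslash = "LITERAL 92" in dump
print(starts_with_backslash)
```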
On 2024-10-11 22:13, AVI GROSS via Python-list wrote:
Is there some utility function out there that can be called to show what the
regular expression you typed in will look like by the time it is ready to be
used?
Obviously, life is not that simple as it can go through multiple layers with
each dealing with a layer of backslashes.
But for simple cases, ...
Yes. It's called 'print'. :-)
>>> import re
>>> re_string = '\\w+\\\\sub'
>>> re_pattern = re.compile(re_string)
>>> # Should look as if we had used r'\w+\\sub'
>>> print(re_pattern.pattern)
\w+\\sub
On 10/12/2024 6:59 AM, Peter J. Holzer via Python-list wrote:
On 2024-10-11 17:13:07 -0400, AVI GROSS via Python-list wrote:
Is there some utility function out there that can be called to show what the
regular expression you typed in will look like by the time it is ready to be
used?
I assume that by "ready to be used" you mean the compiled form?
No, there doesn't seem to be a way to dump that. You can
p = re.compile("\\\\sout{")
print(p.pattern)
but that just prints the input string, which you could do without
compiling it first.
It prints the escaped version,
so you can see if you escaped the string as you intended. In this
case, the print will display '\\sout{'.
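To spell out what gets displayed: print() shows the string the regex engine receives, while repr() shows that same string written back as a Python literal, with every backslash doubled again. A minimal sketch:

```python
import re

p = re.compile("\\\\sout{")

as_seen_by_regex = p.pattern            # seven characters: \ \ s o u t {
as_python_literal = repr(p.pattern)     # quotes added, backslashes doubled

print(as_seen_by_regex)
print(as_python_literal)
print(len(as_seen_by_regex))
```

If the printed form is what you would have typed after an r prefix, the escaping came out as intended.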
As a trivial example, the regular expressions r"\\sout{" and r"\\sout\{"
are equivalent (the \ before the { is redundant). Yet
re.compile(s).pattern preserves the difference between the two strings.
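A small check of that claim, using the test string from earlier in the thread (the variable names are mine): both patterns find the same match, yet .pattern simply hands back whatever text went in.

```python
import re

s = r"testing \sout{WHADDEVVA}"

plain = re.compile(r"\\sout{")      # brace left unescaped
escaped = re.compile(r"\\sout\{")   # brace escaped, redundantly

same_match = plain.search(s).span() == escaped.search(s).span()
same_text = plain.pattern == escaped.pattern

print(same_match)   # the two patterns behave alike on this input
print(same_text)    # but the stored pattern strings differ
```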
Peter J. Holzer wrote:
As a trivial example, the regular expressions r"\\sout{" and r"\\sout\{" are equivalent (the \ before the { is redundant). Yet
re.compile(s).pattern preserves the difference between the two strings.
Allow me to be fussy: r"\\sout{" and r"\\sout\{" are similar but not equivalent.
If you omit the backslash, the parser has to determine whether the
brace is the start of a {n,m} repetition quantifier, which takes more time.
On 2024-10-19 00:15:23 +0200, jak via Python-list wrote:
Allow me to be fussy: r"\\sout{" and r"\\sout\{" are similar but not equivalent.
Yes, that's the parser. But the result of parsing will be the same:
The string will end in a literal backslash.