How do I extract a double value from a string using regex?
import re
pattr = re.compile(???)
x = pattr.match("4.5")
Answers:
Method 1
A regexp from the perldoc perlretut:
import re
re_float = re.compile(r"""(?x)
    ^
    [+-]?\ *            # first, match an optional sign *and space*
    (                   # then match integers or f.p. mantissas:
        \d+             # start out with a ...
        (
            \.\d*       # mantissa of the form a.b or a.
        )?              # ? takes care of integers of the form a
        |\.\d+          # mantissa of the form .b
    )
    ([eE][+-]?\d+)?     # finally, optionally match an exponent
    $""")
m = re_float.match("4.5")
print(m.group(0))
# -> 4.5
To extract numbers from a bigger string:
s = """4.5 abc -4.5 abc - 4.5 abc + .1e10 abc . abc 1.01e-2 abc
1.01e-.2 abc 123 abc .123"""
print(re.findall(r"[+-]? *(?:\d+(?:\.\d*)?|\.\d+)(?:[eE][+-]?\d+)?", s))
# -> ['4.5', '-4.5', '- 4.5', '+ .1e10', ' 1.01e-2',
#     '1.01', '-.2', ' 123', ' .123']
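Some of the matches above contain a space between the sign and the digits (for example '- 4.5'), and float() rejects those. Here is a minimal sketch, assuming you want numeric values rather than the raw substrings. The helper name extract_floats is made up for illustration and is not part of the original answer.

```python
import re

# Same pattern as above, with the backslashes written out.
FLOAT_RE = re.compile(r"[+-]? *(?:\d+(?:\.\d*)?|\.\d+)(?:[eE][+-]?\d+)?")

def extract_floats(text):
    """Return every number found in text as a float (hypothetical helper)."""
    # Drop spaces so that '- 4.5' becomes '-4.5' before conversion.
    return [float(match.replace(" ", "")) for match in FLOAT_RE.findall(text)]

values = extract_floats("4.5 abc - 4.5 abc + .1e10 abc 123")
print(values)
```

Whether '- 4.5' should count as a negative number or as a dash followed by 4.5 depends on your input. If it should not, remove the ` *` from the pattern.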
Method 2
Here’s the easy way. Don’t use regexes for built-in types.
try:
    x = float(someString)
except ValueError:
    # someString was NOT floating-point, what now?
    x = None  # placeholder: choose a fallback that suits your program
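The try/except idea can be wrapped in a small function. This is only a sketch: the name to_float and the default parameter are assumptions for illustration, not something the answer prescribes.

```python
def to_float(text, default=None):
    """Return float(text), or default when text is not a valid float string."""
    try:
        return float(text)
    except ValueError:
        return default

print(to_float("4.5"))
print(to_float("abc"))
```

Note that float() tolerates surrounding whitespace, so to_float(" 1e3 ") also succeeds.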
Method 3
To parse int and float values (with a point as the decimal separator):
re.findall(r'\d+\.*\d*', 'some 12 12.3 0 any text 0.8')
result:
['12', '12.3', '0', '0.8']
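One caveat about this pattern: `\.*` allows any number of dots, so inputs such as '1..2' or '7.' match as a whole. The sketch below compares it with a stricter variant that requires digits after a single dot. The stricter pattern is my own assumption about what is usually wanted, not part of the original answer.

```python
import re

LOOSE = r"\d+\.*\d*"        # pattern from the answer
STRICT = r"\d+(?:\.\d+)?"   # assumed stricter variant: at most one dot, digits required after it

text = "some 12 12.3 0 any text 0.8"
same_on_clean_input = re.findall(LOOSE, text) == re.findall(STRICT, text)
print(same_on_clean_input)  # -> True

messy = "v1..2 and 7."
loose_matches = re.findall(LOOSE, messy)
strict_matches = re.findall(STRICT, messy)
print(loose_matches)   # -> ['1..2', '7.']
print(strict_matches)  # -> ['1', '2', '7']
```

Neither pattern handles signs or exponents; Method 1 covers those.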
Method 4
A brute-force regular expression for a float. There are minor differences from J.F. Sebastian's version:
import re

if __name__ == '__main__':
    x = str(1.000e-123)
    reFloat = r'(^[+-]?\d+(?:\.\d+)?(?:[eE][+-]\d+)?$)'
    print(re.match(reFloat, x))
    # -> <re.Match object; span=(0, 6), match='1e-123'>
Method 5
Just to note that none of these answers cover the interesting edge cases such as “inf”, “NaN”, “-iNf”, “-NaN”, “1e-1_2_3_4_5_6”, etc.
(Inspired by Eric’s answer to “Checking if a string can be converted to float in Python”.)
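To see what these edge cases do in practice, here is a short sketch that passes each string to the built-in float(). It assumes Python 3.6 or later, where underscores between digits are accepted in numeric strings. Older versions raise ValueError for the last entry.

```python
import math

edge_cases = ["inf", "NaN", "-iNf", "-NaN", "1e-1_2_3_4_5_6"]

# float() accepts all of these, case-insensitively for inf/nan.
parsed = [float(text) for text in edge_cases]

for text, value in zip(edge_cases, parsed):
    print(repr(text), "->", value)

# The last entry is 1e-123456, which underflows to 0.0.
print(math.isnan(parsed[1]), math.isinf(parsed[2]), parsed[4] == 0.0)
```

None of the regexes above match these strings, so a regex-based validator and float() disagree on them.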
All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.