Complete this regex to capture src path for HTML image tags

I'm building a regular expression to capture the src path of images in HTML tags.

See: http://regexr.com/3e9bv

It needs to accommodate single or double quotes, as well as attributes appearing in any order within the image tag.
It also needs to ignore other elements that have src attributes (iframe, script, etc.).

The provided example is about 80% of the way there; I'm just stuck on getting it to 100%.

awarded to kostasx
Tags
regex
regexp


3 Solutions

Winning solution

Hmmm... this seems to do the job on the test page:

img(?:.*=['|"].*['|"])? src=['|"]([^/|^http?s].*?)['|"]

Test: https://regex101.com/r/eW9kT3/1
Test: http://regexr.com/3e9cb

It's capturing too much. For example, for <img class="center" src="kitten.jpeg" style="width:200px"> it should capture just kitten.jpeg, but it's capturing img class="center" src="kitten.jpeg".
Difranco 3 years ago
Paste the test text in the textarea and press the REGEX button. You should be OK. https://jsfiddle.net/m39uLj4p/
kostasx 3 years ago
I see now where I was mistaken - I was confusing "matches" and "captures". Got it working in my application (https://gist.github.com/anonymous/1ac153e1fcf780e00a99d83a1da9bd0b). Thanks for the solution.
Difranco 3 years ago
Exactly. Thanks! ;)
kostasx 3 years ago
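
For anyone who hits the same "matches" vs "captures" confusion, here is a minimal sketch in Python. The thread does not say which language the asker used, so Python and the variable names are assumptions for illustration. The pattern is copied unchanged from the winning solution and the tag is the one from the comment above.

```python
import re

# Pattern copied unchanged from the winning solution in this thread.
PATTERN = re.compile(r'''img(?:.*=['|"].*['|"])? src=['|"]([^/|^http?s].*?)['|"]''')

tag = '<img class="center" src="kitten.jpeg" style="width:200px">'
m = PATTERN.search(tag)

full_match = m.group(0)  # the "match": everything the pattern consumed
src_path = m.group(1)    # the "capture": only the parenthesised group

print(full_match)  # img class="center" src="kitten.jpeg"
print(src_path)    # kitten.jpeg
```

Most regex testers highlight the full match, which is why the result looked too long; the path itself lives in capture group 1.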

Here's mine:

\<img.+src=[\"|\'](?!https?:\/\/)([^\/].+?)[\"|\']

https://regex101.com/r/fP6qG0/1
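
For comparison, here is a hypothetical variant that is not from any answer in this thread. It is a sketch under a few assumptions: attribute values contain no ">" character, the src value is always quoted, and Python's re module is the engine. It captures the opening quote and reuses it with a backreference, so single and double quotes both work, and it uses a real character class ["'] (the ['|"] form in the patterns above also accepts a literal "|"). Unlike the posted patterns, it does not filter out absolute URLs or paths starting with "/".

```python
import re

# Hypothetical variant, not from the thread.
# Group 1 is the opening quote, reused via \1 so the path (group 2)
# ends at the same kind of quote that opened it.
IMG_SRC = re.compile(
    r"""<img\b[^>]*?\ssrc\s*=\s*(["'])(.*?)\1""",
    re.IGNORECASE | re.DOTALL,
)

def img_srcs(html):
    """Return the src value of every <img> tag found by the pattern."""
    return [m.group(2) for m in IMG_SRC.finditer(html)]

sample = (
    '<img class="center" src="kitten.jpeg" style="width:200px">\n'
    "<img src='puppy.png' alt=\"a puppy\">\n"
    '<iframe src="frame.html"></iframe>\n'
    '<script src="app.js"></script>\n'
)
print(img_srcs(sample))  # ['kitten.jpeg', 'puppy.png']
```

Because [^>]*? cannot cross a ">", each match stays inside a single tag, so attribute order does not matter and several images on one line are found separately.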


Please don't use a regex; use JSDOM, an HTML parser, or whatever... just don't try to process a structured language with syntax as loose as HTML's via regex.
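
As a rough illustration of the parser route, here is a sketch that swaps JSDOM for Python's standard-library html.parser so that it runs without extra packages. The class name ImgSrcCollector and the helper image_sources are made up for this example.

```python
from html.parser import HTMLParser

class ImgSrcCollector(HTMLParser):
    """Collects the src attribute of every <img> tag it sees."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # Tag and attribute names arrive lowercased;
        # attrs is a list of (name, value) pairs.
        if tag != "img":
            return
        for name, value in attrs:
            if name == "src" and value:
                self.sources.append(value)

def image_sources(html):
    """Parse html and return the src values of its <img> tags, in order."""
    collector = ImgSrcCollector()
    collector.feed(html)
    collector.close()
    return collector.sources

sample = """
<img class="center" src="kitten.jpeg" style="width:200px">
<img src='puppy.png' alt="a puppy">
<iframe src="frame.html"></iframe>
<script src="app.js"></script>
"""
print(image_sources(sample))  # ['kitten.jpeg', 'puppy.png']
```

The parser deals with quoting and attribute order itself, and the tag check is what keeps iframe and script sources out of the result.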
