javascript/regex to populate fields based on url - fancy

We need some JavaScript that runs in IE (10+ is fine), Chrome, Firefox, and mobile browsers, reads the URL from the address bar, and populates some form fields. Several different transforms need to happen. Here are some examples.

Example URL:
https://www.yoursite.com/pricing?utm_source=active%20users&utm_medium=email&utm_campaign=feature%20launch&utm_content=bottom%20cta%20button

Next, we would have form fields populated with the values shown on the right-hand side below. These are all of the transforms we want to be able to assemble on page load. Note that some of them are combined using +.

URL-full = https://www.yoursite.com/pricing?utm_source=active%20users&utm_medium=email&utm_campaign=feature%20launch&utm_content=bottom%20cta%20button

URL-base = www.yoursite.com

URL-protocol = https

URL-param-utm_source = active%20users

URL-param-utm_source+utm_medium = active%20users,email

URL-protocol+base = https,www.yoursite.com

Ideally, for each of the above examples we would have a field on the page whose id equals the transform we need. For example:
<input type="text" class="form-control" id="URL-base" placeholder="base url">

Couple of notes:

It would be great not to have a dependency on jQuery.
Let's make sure a # in the URL doesn't break things, and that URL-full still works as expected when the URL contains a #.
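
One way to meet the # requirement, sketched under assumptions: cut the fragment off before reading the query string, but keep the original string for URL-full. The helper name splitUrl and its return shape are hypothetical and not taken from any submission below. It works on a plain string, so it can be tried outside a browser.

```javascript
// Hypothetical helper: split a URL string into the pieces the question asks for.
function splitUrl(url) {
  // Drop the #fragment first so it can never end up inside a parameter value.
  var hashIndex = url.indexOf("#");
  var withoutHash = hashIndex === -1 ? url : url.slice(0, hashIndex);

  // Separate the query string from the rest.
  var queryIndex = withoutHash.indexOf("?");
  var query = queryIndex === -1 ? "" : withoutHash.slice(queryIndex + 1);
  var beforeQuery = queryIndex === -1 ? withoutHash : withoutHash.slice(0, queryIndex);

  // Protocol is whatever precedes "://"; base is the host that follows it.
  var protoMatch = /^([a-z][a-z0-9+.\-]*):\/\//i.exec(beforeQuery);
  var protocol = protoMatch ? protoMatch[1] : "";
  var rest = protoMatch ? beforeQuery.slice(protoMatch[0].length) : beforeQuery;
  var base = rest.split("/")[0];

  // Collect key=value pairs; values are left URL-encoded, as in the examples.
  var params = {};
  var pairs = query ? query.split("&") : [];
  for (var i = 0; i < pairs.length; i++) {
    if (!pairs[i]) {
      continue;
    }
    var eq = pairs[i].indexOf("=");
    var key = eq === -1 ? pairs[i] : pairs[i].slice(0, eq);
    params[key] = eq === -1 ? "" : pairs[i].slice(eq + 1);
  }

  // "full" is the untouched input, fragment included.
  return { full: url, base: base, protocol: protocol, params: params };
}
```

In a page, the input would be window.location.href, and each field could then be filled with something like document.getElementById("URL-base").value = parts.base.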

20 days ago
Tags
javascript


4 Solutions


<!DOCTYPE html>
<html>
  <head>
    <meta charset="UTF-8">
    <title>title</title>
  </head>
  <body>
    <input type="text" class="form-control" id="URL-full" placeholder="full url">
    <input type="text" class="form-control" id="URL-base" placeholder="base url">
    <input type="text" class="form-control" id="URL-protocol" placeholder="protocol">
    <input type="text" class="form-control" id="URL-param-utm_source" placeholder="utm_source">
    <input type="text" class="form-control" id="URL-param-utm_source_utm_medium" placeholder="utm_source,utm_medium">
    <input type="text" class="form-control" id="URL-protocol_base" placeholder="protocol,base url">
    <script src="./index.js"></script>
  </body>
</html>


function getUrlVars() {
    // Read only the query string so a #fragment never leaks into a value.
    var vars = {};
    window.location.search.replace(/[?&]+([^=&]+)=([^&]*)/g, function(m, key, value) {
        vars[key] = value;
    });
    return vars;
}

var full = document.getElementById('URL-full')
var base = document.getElementById('URL-base')
var protocol = document.getElementById('URL-protocol')
var paramUtmSource = document.getElementById('URL-param-utm_source')
var paramUtmSourceUtmMedium = document.getElementById('URL-param-utm_source_utm_medium')
var protocolBase = document.getElementById('URL-protocol_base')

full.value = document.URL
base.value = document.location.host
protocol.value = document.location.protocol.slice(0, -1)
paramUtmSource.value = getUrlVars()['utm_source'] || ''
paramUtmSourceUtmMedium.value = (getUrlVars()['utm_source'] || '') + ',' + (getUrlVars()['utm_medium'] || '')
protocolBase.value = protocol.value + ',' + base.value

Sorry, I had a glaring mistake at first. JavaScript and HTML are included above.

Hey, thanks for the submission.
While we review, do you think you could make it handle undefined results, either where it can't find a value or where it outputs something like [object HTMLInputElement],[object HTMLInputElement]? When values are chained/combined, as in param+param, and the first param is not found, I think it's totally valid to output just the comma and then the second value. That way, when parsing the data, we know to skip the empty entry.
Qdev 27 days ago
I can, but how exactly would you like undefined handled? That could come up if the URL does not have correct parameters.
jduplessis294 27 days ago
OK, so if we don't get a deterministic outcome, we leave it empty. For example, undefined,email would just render as ,email and email,undefined would render as email,
Qdev 27 days ago
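
The rule agreed on above can be sketched as a small join helper, assuming the query parameters have already been collected into a plain object. The name joinValues is hypothetical and not part of the submission under review.

```javascript
// Hypothetical helper: join the requested parameter values with commas,
// substituting an empty string for any parameter that was not found.
function joinValues(params, names) {
  var out = [];
  for (var i = 0; i < names.length; i++) {
    var value = params[names[i]];
    // Anything that is not a string (undefined, a DOM element, ...) becomes "".
    out.push(typeof value === "string" ? value : "");
  }
  return out.join(",");
}
```

With only utm_medium=email present, joinValues(params, ["utm_source", "utm_medium"]) gives ,email and the reverse order gives email, so the consumer can skip the empty entry by position.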

Here's my solution http://jsfiddle.net/farolan/z9aL47yr/

Features

- populate input fields according to their IDs
- supports custom combination of query params
- handles undefined param

Edit 1

  • added feature to automatically populate input fields according to their IDs
    • input with id URL-param-utm_source will be filled with utm_source query param
    • input with id URL-param-utm_source__utm_medium will be filled with utm_source and utm_medium query params joined by comma
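
The fiddle itself is not reproduced here, so the following is only a sketch of how ids in this format could be turned into values. Both function names are hypothetical, and the params object is assumed to hold the already-parsed query parameters.

```javascript
// Hypothetical helper: derive the parameter names from an input id such as
// "URL-param-utm_source__utm_medium".
function paramNamesFromId(id) {
  var prefix = "URL-param-";
  if (id.indexOf(prefix) !== 0) {
    return null; // not a query-parameter field
  }
  return id.slice(prefix.length).split("__");
}

// Hypothetical helper: build the field value, leaving missing params empty.
function valueForId(id, params) {
  var names = paramNamesFromId(id);
  if (names === null) {
    return "";
  }
  var values = [];
  for (var i = 0; i < names.length; i++) {
    var value = params[names[i]];
    values.push(typeof value === "string" ? value : "");
  }
  return values.join(",");
}
```

One limitation of this sketch: a parameter whose own name contains a double underscore would be split in two.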

Here's my version for the solution.

JSFiddle preview

index.html

main.js

main.css

Description

A very dynamic approach to the solution. The necessary output is defined as a simple schema, e.g.:

const extractionSchema = [
    "base",
    "full",
    ["param", ["utm_source", "utm_campaign"]]
];

This extracts all the necessary parts from the provided URL based on that schema. Schemas are dynamic, so combined extractions like this are possible, e.g.:

const extractionSchema = [
    "protocol",
    ["base", "full", "param", ["utm_source", "utm_campaign"], "protocol"],
    "port"
];

Features

Supports part extraction from default and matrix-parameter URLs (where parameters are split by ;)

Creates and appends the necessary input elements dynamically.

Should work great on old and new browsers and devices (tested and working on an old iPhone 3Gs and IE5).

Easily extendible with new parameters - string or dictionary.

Custom placeholder texts for each possible URL part.
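
For readers unfamiliar with the matrix-parameter style mentioned in the first feature, here is a minimal sketch of reading ;key=value pairs from the path part of a URL. It is not the code from main.js; the function name parseMatrixParams is hypothetical, and it simply merges parameters from all path segments into one object.

```javascript
// Hypothetical helper: collect ";key=value" matrix parameters from a URL path.
function parseMatrixParams(url) {
  // Matrix parameters live in the path, so ignore the fragment and query string.
  var pathPart = url.split("#")[0].split("?")[0];
  var regex = /;([^=;\/]+)=([^;\/]*)/g;
  var params = {};
  var match;
  while ((match = regex.exec(pathPart)) !== null) {
    params[match[1]] = match[2]; // values stay URL-encoded
  }
  return params;
}
```

For example, https://www.yoursite.com/pricing;utm_source=active%20users;utm_medium=email would yield utm_source and utm_medium entries with the same encoded values as the query-string form.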

Available Extraction Parts

full: The full, untouched URL.

base: Hostname of the URL.

protocol: Protocol of the URL.

port: Port of the URL.

path: Path of the URL (e.g. /index.html).

hash: Hash of the URL.

param: Query parameters of the URL.

Notes

As CyteBode has noted already, + is not a valid id character, so with my default settings (easily changeable) labels and ids are generated as follows:

var extractionSchema = [
    "base",
    ["base", "protocol"],
    ["param", ["p1", "p2"], "base"]
];

becomes

URL-base

URL-base-protocol

URL-param:p1__p2-base
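
The mapping above can be reproduced with a short function. This is a sketch inferred only from the three examples, not the actual code in main.js, and the name schemaEntryToId is hypothetical. It assumes Array.isArray is available, which holds for IE9 and later.

```javascript
// Hypothetical helper: turn one schema entry into an element id.
function schemaEntryToId(entry) {
  if (typeof entry === "string") {
    return "URL-" + entry;
  }
  var parts = [];
  for (var i = 0; i < entry.length; i++) {
    var item = entry[i];
    var next = entry[i + 1];
    if (item === "param" && Array.isArray(next)) {
      // "param" followed by a name list becomes "param:name1__name2".
      parts.push("param:" + next.join("__"));
      i++; // skip the name list that was just consumed
    } else {
      parts.push(item);
    }
  }
  return "URL-" + parts.join("-");
}
```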

EDIT: Minor fixes to expand browser compatibility, now works even on IE5 (as low as I could test).

Nice man thanks for this, will test it out in the morning.
Qdev 26 days ago

index.html

<!DOCTYPE html>
<html>
  <head>
    <meta charset="UTF-8">
    <title>URL parser</title>
    <link href="https://stackpath.bootstrapcdn.com/bootstrap/4.1.3/css/bootstrap.min.css" rel="stylesheet" integrity="sha384-MCw98/SFnGE8fJT3GXwEOngsV7Zt27NXFoaoApmYm81iuXoPkFOJwJ8ERdknLPMO" crossorigin="anonymous">
    <style>
      .form {
        max-width: 800px;
        margin: auto;
      }

      label {
        font-weight: bold;
      }
    </style>
    <script type="text/javascript" src="script.js"></script>
  </head>
  <body>
    <form class="form">
      <!-- Will be populated by JS-->
    </form>
  </body>
</html>

script.js

function processTransform(transform, hierarchy) {
  var stack = [];
  var output = [];

  var extractions = transform.split("+");
  for (var i in extractions) {
    var path = extractions[i].split("-");
    if (typeof hierarchy[path[0]] === "undefined") {
      var tempStack = stack.slice();
      var tempHier = tempStack.pop();
      while(typeof tempHier !== "undefined" &&
            typeof tempHier[path[0]] === "undefined") {
        tempHier = tempStack.pop();
        if (typeof tempHier === "undefined") {
          break;
        }
      }
      if (typeof tempHier !== "undefined") {
        hierarchy = tempHier;
        stack = tempStack;
      }
    }
    var j = 0;
    if (path.length > 1) {
      stack.push(hierarchy);
      for (; j < path.length - 1; j++) {
        hierarchy = hierarchy[path[j]];
        if (typeof hierarchy === "undefined") {
          throw "Invalid transform";
        }
      }
    }
    output.push(hierarchy[path[j]])
  }

  return output.join(",");
}


function parseURL(url, decode) {
  if (typeof url === "undefined") {
    url = document.URL;
  }
  url = url.split("#")[0];
  var regex = /[?&]([^=#&]+)=([^&#]*)/g, params = {}, match;
  while (match = regex.exec(url)) {
    if (decode) {
      params[decodeURIComponent(match[1])] = decodeURIComponent(match[2]);
    } else {
      params[match[1]] = match[2];
    }
  }
  return {
    full: url,
    base: url.split("/")[2],
    protocol: url.split(":")[0],
    param: params
  };
}


function processURLTransform(transform, parsedURL) {
  return processTransform(transform.replace(/^URL-/, ""), parsedURL);
}


window.onload = function() {
  var parsedURL = parseURL("https://www.yoursite.com/pricing?" +
                           "utm_source=active%20users&" +
                           "utm_medium=email&" +
                           "utm_campaign=feature%20launch&" +
                           "utm_content=bottom%20cta%20button#hash");
  //var parsedURL = parseURL();

  function constructTransform(tops, bottoms) {
    /// Utility function for generating a human-readable placeholder and the
    /// transform string from a more concise representation.
    if (tops === "param") {
      if (typeof bottoms === "string") {
        bottoms = [bottoms];
      }
      if (bottoms.length === 0) {
        return ["No query", ""];
      }
      var placeholder = ("URL Parameter" + (bottoms.length === 1 ? "" : "s") +
                         " " + bottoms.join(" & "));
      var query = "URL-param-" + bottoms.join("+");
      return [placeholder, query];
    } else {
      if (typeof tops === "string") {
        tops = [tops];
      }
      var placeholders = [];
      var transforms = [];
      for (var i in tops) {
        switch (tops[i]) {
          case "full": placeholders.push("Full URL"); break;
          case "base": placeholders.push("Base URL"); break;
          case "protocol": placeholders.push("URL Protocol"); break;
          default: throw "Unsupported name: " + tops[i];
        }
        transforms.push(tops[i]);
      }
      return [placeholders.join(" & "), "URL-" + transforms.join("+")];
    }
  }

  var transforms = [
    constructTransform("full"),
    constructTransform("base"),
    constructTransform("protocol"),
    constructTransform("param", ["utm_source"]),
    constructTransform("param", ["utm_source", "utm_medium"]),
    constructTransform(["protocol", "base"])
  ];

  var form = document.getElementsByTagName("form")[0];
  for (var i in transforms) {
    if (!transforms[i][1]) {
      continue;
    }

    var transform = transforms[i][1];
    var result = processURLTransform(transform, parsedURL);

    var placeholder = transforms[i][0];
    var id = transform.replace(/\+/g, ":").replace(/ /g, ".");

    var div = document.createElement("div");
    div.setAttribute("class", "form-group");

    var label = document.createElement("label");
    label.innerHTML = transform;
    label.setAttribute("for", id);
    div.appendChild(label);

    var input = document.createElement("input");
    input.id = id;
    input.value = result;
    input.setAttribute("class", "form-control");
    input.setAttribute("type", "text")
    input.setAttribute("placeholder", placeholder);
    div.appendChild(input);

    form.appendChild(div);
  }
}

I made it so the URL-xxx transform strings act as a small general-purpose query language.

To use this solution, first start by parsing the URL by calling parseURL(url) which returns an object with all the fields nicely laid out. If no URL is passed, it will use document.URL by default. Then pass the transform string and the result of parseURL to processURLTransform.

For example: processURLTransform("URL-param-utm_source+utm_medium", parsedURL) will return the values of utm_source and utm_medium from the GET query string.

processTransform actually has the ability to backtrack, which allows it to go down and then back up the hierarchy. As such, a transform like "URL-protocol+param-utm_source+utm_nonexistant+utm_medium+base" will successfully fetch the protocol at the root level, the three params (where utm_nonexistant yields an empty value) and the base, back at the root level.
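
To make the backtracking idea easier to follow in isolation, here is a deliberately simplified sketch that handles only a root object with nested sub-objects and always descends from the root. It is not a replacement for processTransform above (which keeps a stack and supports deeper backtracking); resolveTransform and has are hypothetical names.

```javascript
// Own-property check that avoids inherited names such as "toString".
function has(obj, key) {
  return Object.prototype.hasOwnProperty.call(obj, key);
}

// Hypothetical, simplified resolver for transform strings.
function resolveTransform(transform, root) {
  var pieces = transform.replace(/^URL-/, "").split("+");
  var context = root;
  var output = [];
  for (var i = 0; i < pieces.length; i++) {
    var path = pieces[i].split("-");
    var name = path[path.length - 1];
    if (path.length > 1) {
      // "param-utm_source": descend from the root into the named sub-object.
      context = root;
      for (var j = 0; j < path.length - 1; j++) {
        context = has(context, path[j]) ? context[path[j]] : {};
      }
    } else if (!has(context, name) && has(root, name)) {
      // The name is unknown here but known at the root: back up.
      context = root;
    }
    var value = context[name];
    output.push(typeof value === "string" ? value : "");
  }
  return output.join(",");
}
```

Given a root of { protocol: "https", base: "www.yoursite.com", param: { utm_source: "active%20users", utm_medium: "email" } }, the transform quoted above resolves to https,active%20users,,email,www.yoursite.com, with the empty slot standing in for the missing parameter.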

Tested in IE8, Firefox and Chrome.

Note: Since the + character isn't valid inside of an id in HTML 4, it is replaced with a colon (:). Furthermore, the space character isn't valid in an id in either HTML 4 or HTML 5, so it is replaced with a dot (.).

Edit 1: Added hash sign handling.

Edit 2: Added Bootstrap. Added a way to choose whether to use URI decoding (off by default). Switched from "query string" to "transform string" to avoid confusion with the HTTP query string. Switched to a form with inputs that's entirely generated with JavaScript. Made the processTransform(...) function general-purpose enough to work with arbitrary hierarchies.

Edit 3: Fixed processTransform to work the way I had intended initially. It was working fine for the purpose of the bounty despite being broken, but it now allows for an extended backtracking ability.

I just noticed that IE11 defaults to using compatibility mode for intranet sites (which is how it treats a local index.html). This can either be fixed by unticking "Display intranet sites in Compatibility View" in the Compatibility View Settings, or by adding <meta http-equiv="X-UA-Compatible" content="IE=edge"> in <head>. The JavaScript code still works despite the compatibility mode, but it breaks the presentation.
CyteBode 26 days ago