Google Sheet Regex

 

  • You can use square brackets [] to match one of a set of characters. For example, if you wanted to match words beginning with a lower-case a or a lower-case b, you could use the pattern ^[a|b].*. The ^ character marks the beginning of a word, and the .* matches any characters that follow an a or a b.
  • You can define the end of a string by using the anchor $. For example, .*ly$ would match Finally and Gladly.
  • If you want to check for one of two regular expression components, you can use the pipe operator |. For example, Steve|Sarah would match Steve or Sarah.

Regex Formula with Filter
=FILTER(A1:E29,REGEXMATCH(B1:B29,H2))

Regex Formula with SUM AND Filter
=SUM(FILTER(D3:D30,REGEXMATCH(B3:B30,H4)))


Formula for if with regexmatch and regexreplace

In this exercise, you will use an IF() statement with REGEXMATCH() to find the branded campaigns, then replace the word Brand with Branded using REGEXREPLACE().

  • Recall that with REGEXMATCH(), you first supply the test cell, and then you supply the regex pattern.
  • With REGEXREPLACE() you supply the text (in this case the cell), your regex pattern, and the new string you want to swap in.

A pattern to match branded campaigns is already written for you in cell I2.


=IF(REGEXMATCH(B2,$I$2), REGEXREPLACE(B2, $I$2,"Branded"), B2)


Once again, the marketing director has put you in charge of complying with the social media marketing team's naming convention. This time, you are tasked with changing the ad group names to be more descriptive, so that the ad group names are a bit more informative.

To achieve this, you will need to add the word 'Buyers' to the end of the ad group name, while removing the s from the end of the word (e.g. Mugs will become Mug Buyers). Recall that using a $ can match the end of a string. - For example, ing$ will match the ing in flying.

Much as in the previous exercise, your formula will need to make use of IF()REGEXMATCH(), and REGEXREPLACE().


  • Write a regular expression in cell I2 that will match the trailing s in words like Mugs and SweatersUse the 'flying' example above as a starting point for your pattern!
  • In cell F2, add an IF() statement:
    • Test if cell C2 contains your trailing s pattern $I$2.
    • When the argument is TRUEreplace the s (identified using your pattern) with " Buyers" (don't forget the space!)
    • When it is FALSE, return the string "No Match".
  • Apply the formula to the remainder of the cells, F3:F29.



=IF(REGEXMATCH(C2,$I$2),REGEXREPLACE(C2,$I$2," Buyers"),C2)



Extract brand campaign names using REGEXEXTRACT()

After some discussions with the social media marketing team, the digital marketing director decided to simply remove the terms Brand or Branded from each of the campaign names.

This is where you come in. You know how to replace using regex but, this time, you will be using REGEXEXTRACT() to pull all the characters not equal to Brand. For example, after the extraction, DataCamp Brand will become DataCamp.

To perform this extraction, you will need to write a new regular expression. As an example, the regex (.*).Expressions extracts any number of characters preceding the text and period, (i.e. Regular would be extracted from Regular Expressions).



  • A regular expression in cell I2 that will match any branded campaign (i.e. containing Brand) is already written for you in cell I2.
  • In cell I3, write another regular expression to extract the group of characters not equal to BrandRefer to the example above for a starting point!
  • Modify the current IF() statement in cell F2 by replacing the second argument, "REGEXEXTRACT", with a REGEXEXTRACT() function.
    • The function should contain two arguments: the input text (in this case, B2), and the cell of your extract pattern.
  • Apply the formula to the remainder of the cells, F3:F29. Make sure that your pattern (I3) is an absolute reference!
IF(REGEXMATCH(B3,$I$2),REGEXEXTRACT(B3,$I$3),B3)




After much thought, and testing different naming conventions, the digital marketing director and the social media team have finally settled on a naming convention. They have decided to simply create campaign IDs.

These IDs will contain information about the source and campaign. For instance, the test IDs you create will be in the following format: the first letter from the Source, followed by an underscore, then the full Campaign Name. For example, the campaign ID for the source Yahoo with a campaign Spreadsheets would be: Y_Spreadsheets.

Recall that a begin anchor is ^. For example, ^f will match the f in flying. However, in this exercise you will need to be able to match the first letter from any word.


  • In cell I2, write a regular expression that will extract only the first character in a string.
  • In cell F2, write a formula to create the new test IDs:
    • Use REGEXEXTRACT() to pull the first letter from cell A2 using your pattern in I2 (as an absolute reference).
    • In the same cell, use t& characters to add a "_" and the campaign name (in cell B2).
  • Apply this formula to the remainder of the cells in the column, F3:F29.
Hint
  • . will match any character. So you can use ^. as your pattern in I2.
  • Make sure the REGEXEXTRACT() function includes both A2 and I2 in the arguments.
  • The character & joins strings, and there should be two instances of & in this exercise.
  • Make sure that I2 is an absolute reference.

=REGEXEXTRACT(A2,$I$2)&"_"&B2


  • (^.{6}) would extract the first 6 characters of a string.
  • The first argument in REGEXEXTRACT() should be the cell with the string of interest, and the second should be the Regex.
  • The formula should resemble: REGEXEXTRACT(string, regular expression)&"_"&campaign ID


  • Write a REGEXMATCH() function to match the column A1:A29 with the value in the source dropdown, H2.
  • Working in cell H19 again, wrap the REGEXMATCH() function from the previous step with a FILTER() function, filtering for range A1:F29.
  • Finally, still in cell H19, wrap the FILTER() function with an IF() statement, where LEN(H2) > 1 as the condition, the FILTER() function as the value if True, and range A2:F29 if False
  • =IF(LEN(H2),FILTER(A1:F29,REGEXMATCH(A1:A29,H2)),TRUE).

Comments

Popular posts from this blog

Binomial Test in Python

Slicing and Indexing in Python Pandas

Python Syntax and Functions Part2 (Summary Statistics)