How To Enable Gaming Mode on Android 12

Back in Februry, folks at XDA had spotted an unreleased Gaming Dashboard in the works for Android 12. But up until now, there were no clear...

Saturday

Remove Duplicates from Comma-Delimited Strings in Google Sheets

Posted by   on

You may already have the formula to remove duplicates from comma-delimited strings in Google Sheets. But do you have an array formula to expand for each row? Possibly, won’t, am I right?


The purpose of the array formula here is to remove duplicates from comma (pipe or any other) delimited strings from each row by residing in the topmost row in the range.


I am not simply going to provide you the array formula. But I have included the total steps involved, and there are nine main steps.


If you are a newbie, then you might not understand the steps. But definitely, you can understand what is going behind the array formula that removes duplicates from comma delimited strings. In addition to that, you can start learning some tips from the steps, such as regex use.


Non-Array Formula to Remove Duplicates from Comma-Delimited Strings


Regarding the non-array formula, you can use the below one in cell B2 for the comma-delimited string in cell A2 that contains duplicates.


=ArrayFormula(textjoin(", ",true,unique(trim(split(A2,",")),true)))

Non-Array Formula to Remove Duplicates from Comma-Delimited Strings

The formula separates the strings by splitting (SPLIT) them. Then UNIQUE and combine (TEXTJOIN) it back to a comma-delimited string.


Even if the formula contains the ARRAYFORMULA function, it will only remove duplicates from the comma-delimited string in cell A2. For cell A3, you should drag this formula down.


The SPLIT function in the formula creates an array by splitting comma delimited strings. That’s why I have used the ARRAYFORMULA function.


What about writing an expanding array formula to remove duplicates from the comma-delimited strings in Google Sheets?


As I have mentioned, I do have the expanding formula. If you use the same in the above example, you should insert the formula only in cell B2. It will take the range A2:A and expand the results accordingly.


Array Formula to Remove Duplicates from Comma-Delimited Strings


The array formula is a bit lengthier. Here it is.


=ArrayFormula(transpose(split(textjoin(", ",1,IFERROR((if(row(indirect("A2:A"&counta(to_text(unique(filter(flatten(split(if(len(A2:A),row(A2:A),)&" "&trim(substitute(A2:A,",",","&row(A2:A)&" ")),",")),flatten(split(if(len(A2:A),row(A2:A),)&" "&trim(substitute(A2:A,",",","&row(A2:A)&" ")),","))<>""))))))-match(REGEXEXTRACT(to_text(unique(filter(flatten(split(if(len(A2:A),row(A2:A),)&" "&trim(substitute(A2:A,",",","&row(A2:A)&" ")),",")),flatten(split(if(len(A2:A),row(A2:A),)&" "&trim(substitute(A2:A,",",","&row(A2:A)&" ")),","))<>""))),"\w+"),REGEXEXTRACT(to_text(unique(filter(flatten(split(if(len(A2:A),row(A2:A),)&" "&trim(substitute(A2:A,",",","&row(A2:A)&" ")),",")),flatten(split(if(len(A2:A),row(A2:A),)&" "&substitute(A2:A,", ",","&row(A2:A)&" "),","))<>""))),"\w+"),0)=1,"|",)&REGEXREPLACE(to_text(unique(filter(flatten(split(if(len(A2:A),row(A2:A),)&" "&trim(substitute(A2:A,",",","&row(A2:A)&" ")),",")),flatten(split(if(len(A2:A),row(A2:A),)&" "&trim(substitute(A2:A,",",","&row(A2:A)&" ")),","))<>""))),"^(\w*\W*)1","")))),"|")))

For the sample sheet with the formula entered, please scroll down to the end of the post.


As I have mentioned, we can explain the above formula in 9 steps. For the explanation purpose, I am going to classify them under three categories. Here are them.



  1. Split Strings and Row Wise Unique (3 steps).

  2. Elements to Combine Strings Back to Its Rows (4 steps).

  3. Final Steps to Remove Duplicates from Comma Delimited Strings (2 steps).


I hope the below steps could enlighten you on what is happening behind the array formula that removes duplicates from the comma-delimited strings.


Notes


Please note that I won’t explain how to combine the formulas under each step to make the final formula. It’s easy to do but a little trickier.


The reason, the step-by-step formulas are for a limited (closed) range. But the final combined one is for an infinite (open) range.


We are considering comma delimited strings. If the strings are separated by some other delimiter, such as a pipe; then, you should make the changes accordingly in the formula.


Let’s go to the steps to understand how the above array formula removes duplicates from comma-delimited strings in Google Sheets.


Split Strings and Row Wise Unique


The below formulas go in cells B2, C2, and D2. Please see the image below to understand what they return.


1. Inserting Row Numbers and a White Space Immediately After the Delimiters.


=ArrayFormula(trim(substitute(A2:A,",",","&row(A2:A)&" ")))

2. Inserting Row Numbers at the Beginning of Values in Each Row.


=ArrayFormula(if(len(A2:A),row(A2:A),)&" "&B2:B)

The purpose of the above two formulas is to add row numbers to each string. For example, all the values in row # 2 will get row number 2 preceding them.


3. Split Strings Based on Delimiter, Unique, and Filter.


=ArrayFormula(to_text(unique(filter(flatten(split(C2:C,",")),flatten(split(C2:C,","))<>""))))

Steps 1 to 3 - Split Strings and Row Wise Unique

When you unique the strings this way, the formula will only remove duplicates from the same row. Because each value has its row numbers associated with it.


For example, if the text “BB” is in row 2 and row # 100 one time each, they won’t get eliminated because the values are now “2 BB” and “100 BB”, not duplicates.


We have solved one main issue that we may encounter when trying to remove duplicates from comma-delimited strings.


Elements to Combine Strings Back to Its Rows


The below formulas go in cells E2, F2, G2, and H2.


The step # 3 formula in the previous step returns unique row number added values. Please see column D. Below is the 4th step, which will extract the row numbers, and in the fifth step, the rest of the values. Please refer to the image below.


4. Extracting Row Numbers.


=ArrayFormula(REGEXEXTRACT(D2:D16,"\w+"))

5. Extracting Strings.


=ArrayFormula(REGEXREPLACE(D2:D16,"^(\w*\W*)1",""))

Related: Regex to Get All Words after Nth Word in a Sentence in Google Sheets.


The purpose of the following two (steps # 6 and 7) formulas is to return a separator to mark the end of each row based on column D.


We will achieve that by returning the running count of step # 4 output and then replacing sequence number 1 with a pipe and the rest of the values with blanks.


You can better understand them with the help of the screenshot below.


6. Running Count Based on Extracted Row Numbers.


=ArrayFormula(row(A2:A16)-match(E2:E16,E2:E16,0))

7. Delimiter to Separate Rows.


=ArrayFormula(if(G2:G16=1,"|",))

Note:- Here, I have used the | character. If your strings are separated by the same, not a comma, then use here a ~ character instead.


Steps 4 to 7 - Elements to Combine Strings Back to Its Rows

Final Formula that Removes Duplicates from Comma-Delimited Strings


Please see columns F and H. We want those columns in the final step. First, we will combine the pipe (or tilde) separators in column H with the strings in column F.


Then join them using a comma separator, split at the pipe (or tilde) separator, and transpose.


The following formulas go in cells I2 and J2.


8. Add Separators with Extracted Strings.


=ArrayFormula(H2:H16&F2:F16)

9. Combine Strings Using Comma Delimiter and Split Based on Separator.


=transpose(split(textjoin(", ",1,I2:I16),"|"))

Steps 8 and 9 - Array Formula to Remove Duplicates from Comma Delimited Strings

This way we can remove duplicates from comma-delimited strings in Google Sheets.


Pros and Cons


Pros



  1. Array Formula:- It’s an array formula. So it will automatically include new values in A2:A.

  2. Row Wise Duplicate Identification:- One of the advantages of the above formula is that it removes duplicates based on rows. I have already mentioned the same in the post. If that’s not your priority, then there is no need to go for such a lengthier formula. You can try this one – Extract Unique Values from a Comma Separated List in Google Sheets.


Cons



  1. It may resource hungry. I didn’t test the formula in a larger range.

  2. Blank Row Issue:- For example, if the comma-delimited strings are in A2:A100, there shouldn’t be any blank row in between.

  3. May return error in a larger data range.

  4. May contain other unknown flaws (you can notify the same via comments).


Thanks for the stay. Enjoy!


Sample_Sheet_15521

No comments:
Write Comments

Hello Friends, welcome to autobloginc.blogspot.com we Hope You'll like it - COntact US
!!THANK YOU FOR YOUR SUPPORT!!