Multiple column matching and adjusting with awk



I have a file that looks like this:

    ID A1 A2 A3
    1 A G A
    2 T G A
    3 T A G
    4 T G A
    5 A A G
    6 A C A
    7 C T G

It is thousands of rows long and made up of G, C, T and A, where G complements C and A complements T. What I'm trying to do is check whether A1 matches either A2 or A3. If there is a match, the row should be left as it is; if there isn't, A2 and A3 should be changed to their complements, i.e. A becomes T and G becomes C, and vice versa.

So the output would be:

    ID A1 A2 A3
    1 A G A
    2 T C T
    3 T T C
    4 T C T
    5 A A G
    6 A C A
    7 C A C

I thought I could do it by using awk to filter matching and non-matching IDs using these:

    awk '' mergedlist > nonmatchlist

and

    awk '' mergedlist > matchlist

but it only worked for one variable, i.e. T in the former and A in the latter.







linux bash text-processing awk

edited 16 hours ago by Rui F Ribeiro
asked Aug 14 '15 at 19:41 by Asfound




















          3 Answers
    perl -lane 'sub flip { if ($_[0] eq "T") { "A" } elsif ($_[0] eq "A") { "T" } elsif ($_[0] eq "G") { "C" } elsif ($_[0] eq "C") { "G" } else { $_[0] } } if (!($F[1] eq $F[2] or $F[1] eq $F[3])) { $F[2] = flip($F[2]); $F[3] = flip($F[3]) } print "@F"' < input

This should be easy to port to awk, as it's not really doing anything fancy, but that would take me more time to figure out.
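For reference, an awk port of that perl logic could look roughly like the sketch below. It mirrors the perl version step by step: `flip` returns the complement and passes anything else through unchanged, and the two fields are only rewritten when A1 equals neither of them. The wrapper name `flip_unmatched` is made up for illustration; `input` is the file name used in the perl command.

```shell
# Sketch of an awk port of the perl one-liner (wrapper name is hypothetical).
flip_unmatched() {
    awk '
    # Return the complementary base; anything else passes through unchanged.
    function flip(b) {
        if (b == "T") return "A"
        if (b == "A") return "T"
        if (b == "G") return "C"
        if (b == "C") return "G"
        return b
    }
    # Complement A2 and A3 only when A1 equals neither of them.
    !($2 == $3 || $2 == $4) { $3 = flip($3); $4 = flip($4) }
    { print }
    ' "$@"
}

# Usage: flip_unmatched input
```

As in the perl version, the header row goes through the same rule; its fields are not bases, so `flip` returns them unchanged.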







answered Aug 14 '15 at 19:56 by thrig























You could construct an associative array as a lookup table for the complements, e.g.

    awk '
    BEGIN {
      complement["A"]="T"; complement["T"]="A";
      complement["C"]="G"; complement["G"]="C";
    }

    # data rows where A1 matches neither A2 nor A3: swap in the complements
    NR>1 && $3!=$2 && $4!=$2 {
      $3 = complement[$3];
      $4 = complement[$4];
    }

    {
      print;
    }
    ' file
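To check the lookup-table approach against the sample from the question, a self-contained run might look like the sketch below. The wrapper name `complement_unmatched` is made up for illustration, and the `in` guard is an addition that is not part of the answer: it keeps a symbol that is missing from the table instead of replacing it with an empty string.

```shell
# Wrap the lookup-table approach in a function that reads a file or stdin.
complement_unmatched() {
    awk '
    BEGIN {
        complement["A"] = "T"; complement["T"] = "A"
        complement["C"] = "G"; complement["G"] = "C"
    }
    # Data rows where A1 matches neither A2 nor A3.
    NR > 1 && $3 != $2 && $4 != $2 {
        # Replace only symbols found in the table; anything else is kept.
        if ($3 in complement) $3 = complement[$3]
        if ($4 in complement) $4 = complement[$4]
    }
    { print }
    ' "$@"
}

# Feed it the sample rows from the question.
result=$(complement_unmatched <<'EOF'
ID A1 A2 A3
1 A G A
2 T G A
3 T A G
4 T G A
5 A A G
6 A C A
7 C T G
EOF
)
printf '%s\n' "$result"
```

With the sample rows, rows 2, 3, 4 and 7 come out as `2 T C T`, `3 T T C`, `4 T C T` and `7 C A C`, and the other rows are printed unchanged, which matches the output the question asks for.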






answered Aug 14 '15 at 20:18 by steeldriver





















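As an alternative to an array as suggested by @steeldriver, you could define a function:

    awk '
    BEGIN { FS = " +" }
    NR == 1 { print $0 }
    function CHANGE( F ) {
      if ( F == "A" ) F = "T"
      else if ( F == "T" ) F = "A"
      else if ( F == "C" ) F = "G"
      else F = "C"
      return F
    }
    # Assumed rule body (missing in the source): complement A2 and A3
    # when A1 matches neither of them, then print the row
    NR >= 2 {
      if ( $2 != $3 && $2 != $4 ) { $3 = CHANGE($3); $4 = CHANGE($4) }
      print $0
    }
    ' file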
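One detail applies to any awk version that assigns to a field: awk then rebuilds the record using the output field separator `OFS`, which is a single space by default. That is harmless for the space-separated sample. If the real file were tab-separated (an assumption, since the question only shows spaces), then with the default separators the changed rows would come out space-separated while untouched rows kept their tabs. A minimal sketch that sets both separators, with a hypothetical wrapper name `complement_tsv` and a `CHANGE` that leaves unknown symbols alone:

```shell
# Tab-separated variant of the function-based approach (wrapper name is hypothetical).
complement_tsv() {
    awk '
    BEGIN { FS = OFS = "\t" }    # split on tabs and rebuild rows with tabs
    function CHANGE(F) {
        if (F == "A") return "T"
        if (F == "T") return "A"
        if (F == "C") return "G"
        if (F == "G") return "C"
        return F                 # unknown symbols pass through unchanged
    }
    NR == 1 { print; next }      # header row is printed as is
    $2 != $3 && $2 != $4 { $3 = CHANGE($3); $4 = CHANGE($4) }
    { print }
    ' "$@"
}

# Usage: complement_tsv file.tsv > adjusted.tsv
```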






answered Aug 15 '15 at 18:32 by Fiximan




























                                        Why is this plane circling around the Lucknow airport every day?Why do aircraft on Flight Radar 24 jump around randomly sometimes?What airport has this walkway over a taxiway?How does Chicago O'Hare's tower sequence aircraft at peak capacity?Which airport is featured in this Delta commercial?After a crash, for how long is the airport closed?Can a passenger plane stand still in the air, or hover at a fixed location above a ground?What are those trucks towing around, and why?What is this airport outside of Cairo, Egypt?Which US airport has the lowest circling MDH?What is this airport video?