BUILD LOGIC - what could be better than using AWK?
Below is what the log looks like:

    Company=XYZ
    Req_id=1234
    Time_taken=10 sec
    Status=Success

    Company=ABC
    Req_id=3456
    Time_taken=200 sec
    Status=Failure

    Company=DFG
    Req_id=3001
    Time_taken=15 sec
    Status=Success

We have to get the top 3 request IDs by time taken.
I tried the solution below, which gets the request ID and the time taken, but I am not happy with my answer:

    awk -vRS= -F'[=\n]' '/Time_taken/{print $4,$6}' test.txt | sort -nr

How could I have done this better? I would like to use some logic in the code instead of built-in functions. Also, could someone help me understand -F'[=\n]' better? I have just been copying it from my previous scripts.
Output should be:
Below Request Id took more then expected
Request id 3456, Time Taken 200 sec
Request id 3001, Time Taken 15 sec
Request id 1234, Time Taken 10 sec
shell-script text-processing awk
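
As a rough illustration of what -vRS= and -F'[=\n]' do in the command above: an empty RS puts awk in paragraph mode, so each blank-line-separated block is one record, and the bracket expression makes either "=" or a newline a field separator. The sketch below mimics that splitting in Python; the record is copied from the question, and the variable names are only illustrative.

```python
import re

# One blank-line-separated block, i.e. one awk record when RS is empty.
record = "Company=ABC\nReq_id=3456\nTime_taken=200 sec\nStatus=Failure"

# -F'[=\n]' is a regular expression: split on "=" or on a newline.
fields = re.split(r"[=\n]", record)

# awk counts fields from 1, so $4 is fields[3] and $6 is fields[5].
print(fields[3], fields[5])
```

Note that the trailing `sort -nr` in the original command sorts on the leading number of each line, which is the request ID rather than the time; sorting on the time would need a key such as `sort -k2,2nr`.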
I have edited my question with what i require. Please help to get this off hold. – Machine, yesterday

You've got my reopen vote, hopefully others will follow (5 are required). Please, write "I" always with capital letter. – peterh, yesterday

I cast the last required open vote. I’ve also edited to remove the syntax highlighting of the input and output. I’ve also upvoted it because it’s now a useful, clear question that shows what you’ve already tried or researched. For future questions, I’d recommend reading How to Ask and unix.meta.stackexchange.com/q/5015/22812 – Anthony Geoghegan, yesterday
asked Mar 23 at 11:10 by Machine; edited yesterday by Anthony Geoghegan
4 Answers
Using only awk:

    function list_insert (value, id, tmp) {
        # Walk the sorted list; once the new value beats an entry,
        # swap it in and keep pushing the displaced pair down the list.
        for (i = 1; i <= list_length; ++i)
            if (value > value_list[i]) {
                tmp = value_list[i]
                value_list[i] = value
                value = tmp

                tmp = id_list[i]
                id_list[i] = id
                id = tmp
            }
    }

    BEGIN {
        FS = "[= ]"
        list_length = 3
    }

    $1 == "Req_id"     { id = $2 }
    $1 == "Time_taken" { list_insert($2, id) }

    END {
        printf("Below Request Id took more then expected\n")
        for (i = 1; i <= list_length; ++i)
            printf("Request id %d, time taken %d sec\n", id_list[i], value_list[i])
    }

This program maintains two arrays, value_list and id_list, both of length list_length. The value_list array is sorted and contains the time values, while the id_list array contains the request IDs corresponding to the values in the first list.
The list_insert function inserts a new value and ID into the two arrays in such a way that the order of the value_list array is maintained (it finds the correct location for insertion and then shuffles the remaining items towards the end).
The rest of the program reads the data as newline-delimited records of fields delimited by = or spaces. When a request ID is found, this is saved in id, and when a "time taken" entry is found, that ID and the time taken value are inserted into the arrays.
At the end, the two arrays are used to create the output.
Testing it:
$ awk -f script.awk file
Below Request Id took more then expected
Request id 3456, time taken 200 sec
Request id 3001, time taken 15 sec
Request id 1234, time taken 10 sec
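
For readers more at home in Python, the insertion logic described above can be sketched roughly as follows. This is an illustrative translation, not the answer's own code: the use of None for empty slots and the fourth record (request 9999) are assumptions added to show that only the top three survive.

```python
def list_insert(values, ids, value, req_id):
    """Insert (value, req_id) into two fixed-length parallel lists,
    keeping `values` in descending order; the smallest entry falls off."""
    for i in range(len(values)):
        if values[i] is None or value > values[i]:
            # Swap the new pair in and carry the displaced pair onward.
            values[i], value = value, values[i]
            ids[i], req_id = req_id, ids[i]
            if value is None:
                break  # an empty slot was filled, nothing left to carry


top_values, top_ids = [None] * 3, [None] * 3
# The first three pairs come from the question; (9999, 5) is hypothetical.
for req_id, seconds in [(1234, 10), (3456, 200), (3001, 15), (9999, 5)]:
    list_insert(top_values, top_ids, seconds, req_id)

print(top_ids, top_values)
```

Because the lists never grow beyond three entries, the memory used stays constant however many records the log contains.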
You just blew my mind! Can you help me understand why you have set list_length = 3, and why the list_insert function is required? Is the solution @Archemar gave below not better? – Machine, 19 hours ago

@Machine I would not say my solution is better or worse than any other's. I'm assuming that there may be more than three records in the in-data, and I use list_length = 3 to get the top three results. The list_insert function adds data to the arrays that the program maintains, while always keeping the top three. This kind of combines the sort and head that Archemar is doing as post-processing steps. – Kusalananda♦, 19 hours ago
We can do this using Perl's paragraph mode, in which we read the file in paragraph-sized chunks as records and build up a hash with the request IDs as keys and the times taken as values. At the end of the file, we sort the IDs numerically by time taken, in descending order, and print the desired message.

    $ perl -ln -00 -e '
        %h = (%h, /^Req_id=(\d+)\n.*^Time_taken=(\d+)/ms)}{
        print "The below IDs took more than expected:" if scalar keys %h;
        print join " ", "Req ID:", "$_," , "Time taken", $h{$_}, "sec"
            for (sort { $h{$b} <=> $h{$a} } keys %h)[0..2];
    ' input.file

Output:

    The below IDs took more than expected:
    Req ID: 3456, Time taken 200 sec
    Req ID: 3001, Time taken 15 sec
    Req ID: 1234, Time taken 10 sec
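
The same idea (one regex match per paragraph, a mapping from ID to time, then a reverse numeric sort) might look like this in Python. It is only a sketch under the assumption that records are separated by blank lines; the sample text is the log from the question, and the names log_text, times and top are illustrative.

```python
import re

log_text = """Company=XYZ
Req_id=1234
Time_taken=10 sec
Status=Success

Company=ABC
Req_id=3456
Time_taken=200 sec
Status=Failure

Company=DFG
Req_id=3001
Time_taken=15 sec
Status=Success
"""

times = {}
for paragraph in log_text.split("\n\n"):
    # re.M lets ^ match at each line start, re.S lets . cross newlines.
    m = re.search(r"^Req_id=(\d+)\n.*^Time_taken=(\d+)", paragraph, re.M | re.S)
    if m:
        times[m.group(1)] = int(m.group(2))

# Request IDs ordered by time taken, largest first, limited to three.
top = sorted(times, key=times.get, reverse=True)[:3]
for req_id in top:
    print(f"Req ID: {req_id}, Time taken {times[req_id]} sec")
```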
answered yesterday by Rakesh Sharma; edited 13 hours ago
You need to remember the request number.
I would use the following (this could be written on one line):

    awk -F= '$1 == "Req_id" { r=$2 ; }
      $1 == "Time_taken" { printf "Request id %s %s %s\n",r,$1,$2 ; } ' file |
      sort -r -n -k5 |
      head -3

which gives

    Request id 3456 Time_taken 200 sec
    Request id 3001 Time_taken 15 sec
    Request id 1234 Time_taken 10 sec

where

    -F=                          use = as the separator
    $1 == "Req_id" { r=$2 ; }    remember the last request id
    $1 == "Time_taken"           if the line is a "time taken" line
    { printf "Request id %s %s %s\n",r,$1,$2 ; }    print the request id and the seconds
    | sort                       pipe to sort
    -r                           reverse order
    -n                           numeric sort (e.g. 200 is greater than 15)
    -k5                          on the 5th field
    | head -3                    keep the first 3 lines
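
The pipeline's three stages (remember the last request id, emit one row per "time taken" line, keep the three largest) can also be sketched in Python with heapq.nlargest, which plays the role of sort -r -n piped into head -3. The helper name time_rows and the inlined sample log are assumptions made for illustration.

```python
import heapq

sample = """Company=XYZ
Req_id=1234
Time_taken=10 sec
Status=Success

Company=ABC
Req_id=3456
Time_taken=200 sec
Status=Failure

Company=DFG
Req_id=3001
Time_taken=15 sec
Status=Success
""".splitlines()


def time_rows(lines):
    """Yield (seconds, request_id) pairs, remembering the latest Req_id."""
    req_id = None
    for line in lines:
        key, _, value = line.partition("=")
        if key == "Req_id":
            req_id = value
        elif key == "Time_taken":
            # "200 sec" -> 200
            yield int(value.split()[0]), req_id


# Equivalent in spirit to: sort -r -n on the time, then head -3.
top3 = heapq.nlargest(3, time_rows(sample))
for seconds, req_id in top3:
    print(f"Request id {req_id} Time_taken {seconds} sec")
```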
answered yesterday by Archemar; edited yesterday
I would prefer Python over awk for anything except one-liners. Using a Python script will allow you to more easily handle poorly formatted input and to perform logging and error-handling.
Here is a basic Python script that handles your example input, with a structure that suggests how you might build it out further if needed:

    #!/usr/bin/env python3
    # -*- encoding: utf-8 -*-
    """parse_log.py"""

    import sys

    logfilepath = sys.argv[1]

    # Define a function to parse a single block/entry in the log file
    def parse_block(block):
        parsed_block = dict()
        lines = block.split("\n")
        for line in lines:
            if line.startswith("Company="):
                parsed_block["Company"] = line[8:]
            elif line.startswith("Req_id="):
                parsed_block["Required_ID"] = line[7:]
            elif line.startswith("Time_taken="):
                parsed_block["Time_Taken"] = line[11:]
            elif line.startswith("Status="):
                parsed_block["Status"] = line[7:]
            else:
                # Ignore lines that match none of the known keys
                pass
        return parsed_block

    # Initialize a list to store the processed entries
    parsed_blocks = list()

    # Populate the list; entries are separated by blank lines
    with open(logfilepath, "r") as logfile:
        blocks = logfile.read().split("\n\n")
        for block in blocks:
            parsed_block = parse_block(block)
            parsed_blocks.append(parsed_block)

    # Print the results
    print("Below Request Id took more then expected")
    for parsed_block in parsed_blocks:
        print("Request id {}, Time Taken: {}".format(parsed_block["Required_ID"], parsed_block["Time_Taken"]))

You could run it like this:

    python parse_log.py data.log

On your example input, it produces the following output (in file order, since the script does not sort by time taken):

    Below Request Id took more then expected
    Request id 1234, Time Taken: 10 sec
    Request id 3456, Time Taken: 200 sec
    Request id 3001, Time Taken: 15 sec
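
To get the three slowest requests in descending order, as the question asks, the parsed entries could be sorted before printing. The following is a minimal sketch, assuming dictionaries shaped like those returned by parse_block; the helper seconds and the names slowest and lines_out are hypothetical additions, not part of the script above.

```python
# Entries shaped like the output of parse_block for the example log.
parsed_blocks = [
    {"Required_ID": "1234", "Time_Taken": "10 sec"},
    {"Required_ID": "3456", "Time_Taken": "200 sec"},
    {"Required_ID": "3001", "Time_Taken": "15 sec"},
]


def seconds(block):
    """Return the numeric part of a value such as '200 sec'."""
    return int(block["Time_Taken"].split()[0])


# Sort by time taken, largest first, and keep at most three entries.
slowest = sorted(parsed_blocks, key=seconds, reverse=True)[:3]

lines_out = [
    "Request id {}, Time Taken: {}".format(b["Required_ID"], b["Time_Taken"])
    for b in slowest
]
print("Below Request Id took more then expected")
print("\n".join(lines_out))
```

Converting the time to an integer matters here: sorting the raw strings would place "15 sec" above "200 sec".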
answered yesterday by igal