34,401 questions
Tooling
0
votes
0
replies
9
views
Using persistent-memory gawk, how can variables be created to be local and isolated from other execution instances?
The idea of Persistent-Memory gawk is fabulous because it improves the performance, size, and clarity of many scripts on static and reference data.
However, I have a significant problem in adopting ...
3
votes
3
answers
58
views
awk command to subtract constant from a column and print results
I was working on a one-liner to subtract a constant value (e.g. 100 in this case) from a specific column using awk. So far I can manage to get to where I can print the last iteration only – which ...
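A minimal sketch of the per-line approach, assuming the target is column 2 of whitespace-separated input (the excerpt does not say which column, and the sample rows here are made up for illustration). Doing the subtraction in the main rule, rather than in END, prints every row instead of only the last one:

```shell
# Hypothetical input: two whitespace-separated rows; column 2 is the assumed target.
# c holds the constant to subtract (100, as in the question).
adjusted=$(printf '%s\n' 'a 150' 'b 275' |
  awk -v c=100 '{ $2 -= c; print }')
echo "$adjusted"
```

Assigning to a field makes awk rebuild the record with OFS (a single space by default), so the original spacing is not preserved.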
6
votes
3
answers
183
views
Does the awk standard unambiguously define that `a || b=c` is (or is not) valid?
Most awk implementations I have tried (gawk, mawk, original-awk, bsd awk) allow the form: a || b = c
The exception seems to be busybox which returns a syntax error unless parentheses are used:
$ ...
5
votes
1
answer
122
views
Is there a way to identify if a GNU AWK script is sourced via `-i` / `@include`?
I am looking for a way to write gawk scripts that can function both as standalone tools and function libraries that can be sourced in other scripts.
The difference is in the BEGIN block that I intend ...
3
votes
3
answers
151
views
splitting a FASTA file into a new FASTA file based on the top 100 transcripts
Essentially, I have a large FASTA file with over 100,000 transcripts in it, and I want to take the top 50 longest of those. After doing this:
awk -vRS=">" -vORS="\n" -vFS="\...
1
vote
2
answers
104
views
add special characters "[" in print text using awk
this is a part of my bash script
..
mytstamp=$(date '+%Y-%m-%d %H:%M:%S :: ')
output=$(gawk -v mt="$mytstamp" -f print_errlog.awk errlog.txt)
..
my file: <errlog.txt>
2025-10-11 14:25:...
4
votes
1
answer
181
views
Does "cmd | getline" have a limit for the passed data?
Stumbled across a certain limit (?) when passing data through "cmd | getline" (with both macOS awk 20200816 and GNU Awk 5.3.1).
Reproducible examples:
file.txt - a text file filled with &...
3
votes
1
answer
172
views
Why is awk not identifying the row of a file which is being extracted using an id from a second file? [closed]
I have a large file with several columns and millions of rows. The first column is a unique id for each record which is a long integer. From this large file I need to create a subset with few thousand ...
-3
votes
1
answer
161
views
How can I dissect a shell 1-liner workflow to understand what this git command does? [closed]
I am updating an existing pipeline on GitLab that creates an automatic cascade on GitLab.
When a merge request is merged, the pipeline is triggered to perform an automatic commit to a lower branch (if ...
1
vote
0
answers
38
views
awk does not print a variable's value [duplicate]
Consider this awk program:
BEGIN {
pat = "<patternprefix> " var "<patternsuffix>"
print ">>> " var
print ">>> ...
4
votes
5
answers
299
views
How to merge two CSV files based on matching values in different columns and keep unmatched rows with placeholders?
I'm working on a data cleaning task and could use some help. I have two CSV files with thousands of rows each:
File A contains product shipment records.
File B contains product descriptions and ...
4
votes
2
answers
129
views
awk printf long number padding output incorrect
Arch linux 6.15.7-zen1-1-zen,
$ awk -V
GNU Awk 5.3.2, API 4.0, PMA Avon 8-g1, (GNU MPFR 4.2.2, GNU MP 6.3.0)
Start with y.csv:
4 2016201820192020
5 20162018201920202023
5 20162018201920202024
5 ...
-1
votes
1
answer
167
views
Using sed to extract second appearance of a block [duplicate]
I have the following source file named test.txt:
text before
---BEGIN MARKER---
content 1
---END MARKER---
text between
---BEGIN MARKER---
content 2
---END MARKER---
text between
---BEGIN MARKER--...
2
votes
7
answers
163
views
Modify a column with awk and a bash script
I have a test.txt file looking like this :
a,1,A
b,2,B
c,3,C
d,4,D
e,5,E
f,6,F
I want to modify the second field with some condition :
if value is 1 I modify it to 1_PLUS
if value is 4 I modify it to ...
-1
votes
1
answer
177
views
How can I make awk print a text string followed by the entire line piped from a previous command?
I have 8 blocks of code: this, with a different path and site name. My goal is when the first word of the return is something other than "Success:" print the entire line, preceded by the ...
4
votes
4
answers
241
views
Dynamic precision in awk printf using shell variable
I think this is simple, but it's not working for me.
This is what I have.
float=$(awk -v res="$result" 'BEGIN {printf "%.2f", res / 1000}')
I want to use a variable to set the ...
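One way this might be done, as a sketch: pass the precision in with a second -v and build the format string inside awk. The variable name prec and the sample values are assumptions for illustration, not taken from the question.

```shell
# Hypothetical values standing in for the question's shell variables.
result=123456
prec=3   # assumed name for the variable that holds the desired precision

# Concatenate the precision into the format string, then hand it to printf.
float=$(awk -v res="$result" -v prec="$prec" \
  'BEGIN { fmt = "%." prec "f"; printf fmt, res / 1000 }')
echo "$float"
```

Passing the value with -v avoids splicing shell text into the awk program itself.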
3
votes
7
answers
334
views
Remove only odd lines' line breaks, with Vim, sed or awk
I have the following file:
line 1
abc
line 2
def
line 3
ghi
.....
.....
I need it to become:
line 1 abc
line 2 def
line 3 ghi
......
......
I know how to remove newlines, but not odd or even line ...
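A minimal awk sketch for the sample shown above: odd-numbered lines are printed with a trailing space instead of a newline, and even-numbered lines are printed normally.

```shell
# Sample input taken from the question's example lines.
joined=$(printf 'line 1\nabc\nline 2\ndef\nline 3\nghi\n' |
  awk 'NR % 2 { printf "%s ", $0; next } { print }')
echo "$joined"
```

If the file has an odd number of lines, the last line ends with a space and no newline, so that case may need extra handling.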
2
votes
8
answers
189
views
Remove a string character between 2 special characters in the headers of a fastq file
I have a fastq file containing several sequences with headers such as :
tail SRR11149706_1.fastq
@SRR11149706.16630586 16630586/1
CCCAACAACAACAACAGCAACCTCCTCACGCCAACGCCGATCCCGCCGCTGTTTTCCAA
@...
6
votes
5
answers
152
views
Unpivot a line in sed or awk retaining parent fields
Source lines formatting, for example:
Value11 | Value12 | ValueA,ValueB,ValueC
Value21 | Value22 | ValueA2
Desired output:
Value11 | Value12 | ValueA
Value11 | Value12 | ValueB
Value11 | Value12 | ...
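A sketch of one possible awk approach, assuming the fields are always separated by " | " and only the third field holds the comma-separated list. The desired output is truncated above, so the handling of the second source line is an assumption.

```shell
# Input lines copied from the source example.
unpivoted=$(printf '%s\n' \
  'Value11 | Value12 | ValueA,ValueB,ValueC' \
  'Value21 | Value22 | ValueA2' |
  awk -F' [|] ' -v OFS=' | ' '{
    # Split the list field and repeat the parent fields for each item.
    n = split($3, parts, ",")
    for (i = 1; i <= n; i++) print $1, $2, parts[i]
  }')
echo "$unpivoted"
```

The field separator is written as a regular expression with the pipe inside a bracket expression so that it is matched literally.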
3
votes
8
answers
234
views
Extracting lines from two files where there is a match of value in specific columns
I have two tab-separated files (with thousands of lines each):
File1:
anno1.g20653.t1 anno1.g20674.t1 eud1g02416 eud1g02458 27 +
anno2.g3796.t1 anno1.g20698.t1 eud1g02520 eud1g02556 28 +
File2:
...
6
votes
5
answers
261
views
AWK global match function -- how can I improve it?
When using awk's match() function I find only the first match of a given search pattern. I find this sort of limiting, so I'm trying to find something that gives me all of the matches in a record, ...
4
votes
6
answers
213
views
How to split fields respecting only the left-most separator?
I am attempting to implement selected records reformatting in Bash with AWK as a natural first pick for the job:
#!/bin/bash
process() {
declare payload="$1"
declare -a keysarr=(&...
3
votes
6
answers
158
views
how to select matching rows in multiple files with AWK
I have over forty files with the following structure:
file1 first 21 lines
8191 M0
139559 M1
79 M10
1 M10007
1 M1006
1 M10123
file2 first 21 lines
8584 M0
119837 M1
72 M10
1 M10003
1 M10045
1 M1014
...
-1
votes
4
answers
143
views
How to get substring between html tags in awk
I have a curl response with some plain strings like below -
"version":"1.1.8".
I would like to extract 1.1.8 from the raw text using sed or awk.
I tried the below ...
12
votes
11
answers
765
views
Getting the total count of IDs from a comma delimited list of IDs that can contain ranges with awk
I'm trying to get the total count of CPUs allocated to a job from SLURM's scontrol --details --oneliner show job output. The format is a comma delimited list of CPU IDs that can contain hyphen-...
2
votes
2
answers
123
views
Do we really have to copy the whole array to make it an element of an other array with GNU awk?
I would like to assign an array as an array element, with GNU awk.
Here's a non-working example illustrating the problem:
echo X1 X2 X3 X4 |
awk '
function dummy_split(arr) { split($0, arr) }
...
-1
votes
1
answer
93
views
awk command in linux does not execute; always get the error (awk: line 1: syntax error at or near '}' )
I am a beginner at using Linux bash for bioinformatics purposes, and I recently encountered an error with this 'awk' command. ChatGPT's suggestions are not helping and the task is very basic. I have a ...
4
votes
2
answers
187
views
Filtering rows between two files based on correlation in awk
After reading several topics such as this, this and this, I am still confused about how to tackle the following issue. I have a bunch of files I would like to filter based on the correlation (Pearson) ...
5
votes
4
answers
84
views
sed delete and replace on top of file is not working as expected
I want to replace top 3 lines of my file. For this task, these two commands work fine, but then I get two .bak (saved) files...
These work:
sed -i.orig1 '1,3d' /my/path/file;
sed -i.orig2 '1s/^/line1 ...
1
vote
5
answers
147
views
How to add a field to a csv after deriving its value in a read line
I'm attempting to process in a bash script using awk/cut/sed a simple CSV that looks like this
id,version
84544,abcd v2.1.0-something
3439,abcd a82f1a
3,abcd 2.2.1-bar
Where
abcd is constant
the ...
3
votes
4
answers
134
views
empty string in sprintf in awk
I have the following line of code inside a Jupyter notebook:
!ls data/kapitel*.txt \
| while read file; do \
dirname="${file%.*}"; \
mkdir -p "$dirname"; \
awk \
...
-1
votes
1
answer
189
views
Suggestions for making a file from a bigger file with grep or?
Looking for a suggestion that would be much faster. I have a large (232GB) mongo backup file. I want to take out only the April 24th lines and make a new file containing only this date or any date of ...
1
vote
4
answers
133
views
Printing arbitrary relative dates like git log --relative-date
I'm working on some dev tooling in/for Bash 5.2 (Git Bash for Windows) environments.
What I'm looking for
I'd like to be able to take an arbitrary UTC timestamp (e.g. @1746111769) and print a coarse, ...
0
votes
4
answers
160
views
bash + how to sort log according to specific field
The goal is to extract the numeric part of cost:xxxms and sort the entire log lines
example of log
/var/log/hadoop/hdfs/hadoop-hdfs-datanode-datanode01.star.com.log.7:2025-04-24 11:56:57,334 WARN ...
-3
votes
2
answers
96
views
How to delete a line starting with the character * and all characters from column 24 on [closed]
I found an assembler file I would like to upload into an old computer.
However, the size is too big for my limited RAM.
So, the size should be reduced by deleting the unnecessary comments
original ...
11
votes
7
answers
1k
views
Reformat numbers, inserting separators at fixed positions
There are many lines in a file, like the following:
000100667 ===> 000102833
005843000 ===> 005844000
011248375 ===> 011251958
I would like to insert specific separators into the numbers ...
8
votes
7
answers
338
views
awk to extract a block of text
I am trying to figure out an awk command/script to extract a block of text from a large file. The file subsection I am interested in is this:
Board Info: #512
Manufacturer: "Dell Inc."
...
1
vote
4
answers
167
views
How to make Gawk work with files found by the “find” command with the corresponding output of the “-printf” option available as a variable?
I want to do the following:
Find a particular set of files with the find command;
For any found file, put the corresponding output of the -printf option to a variable called str and pass it to Gawk (...
-4
votes
1
answer
98
views
Find and replace XML tag and not its value [closed]
I have an XML file that contains the following data:
<Extrinsic name="CommodityVendor">1234567</Extrinsic>
<Extrinsic name="buyerVatID">1122334455</Extrinsic&...
-2
votes
5
answers
127
views
Bash getting a specific section from a file that spans multiple lines [closed]
I am using bash on Ubuntu server. My goal is to get the lines from specific section. I can't think of what tool to use to accomplish this (awk, grep -P, sed, ???). It is the specific section that I ...
1
vote
3
answers
91
views
Why does GNU AWK sub function not act on the selected field in this case?
AWK recognises the field value "b" in this example:
$ printf "a ab c d b" | awk '{for (i=1;i<=NF;i++) print $i}'
a
ab
c
d
b
$ printf "a ab c d b" | awk '{for (i=...
-1
votes
5
answers
102
views
Adding an additional line of code to a file after the logical end of a statement with a match string
I have a file that is full of lines like:
snprintf(log_buffer,...
...
...
...
);
If I use sed, I can find these lines with "snprint.*...
5
votes
5
answers
148
views
Match pattern by passing a variable to awk as an alternative to grep -B
grep -B does not work on AIX so I am looking for an alternative with awk
I have a file with the following content
05/25/2025 M 301510sa AIX is vulnerable to information disclosure (CVE-...
3
votes
3
answers
113
views
How can I pass filename prefix to AWK command for file split?
I am using this AWK command to produce a file for each value in the first column of input:
awk -F "," '{print > $1 ".csv" }' test.csv
test.csv content
1,Rahul,
2,Atul,
3,...
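A sketch of how a prefix might be passed in, assuming it lives in a shell variable (the name prefix and the value out_ are hypothetical). Only the first two sample rows from the excerpt are used, and the work is done in a temporary directory.

```shell
# Work in a scratch directory so the generated files do not clutter anything.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '1,Rahul,\n2,Atul,\n' > test.csv

prefix="out_"   # hypothetical prefix, not from the question

# -v hands the shell value to awk; the output file name is parenthesized
# because an unparenthesized concatenation after > is ambiguous in some awks.
awk -F ',' -v prefix="$prefix" '{ print > (prefix $1 ".csv") }' test.csv
ls
```

Note that -v interprets backslash escape sequences in the value, so a prefix containing backslashes would need different handling.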
4
votes
7
answers
159
views
Conditionally add characters to beginning of every line in a file
I am attempting to create a bash script that prepends characters to the start of all lines in a markdown file which do not begin with a '#' character.
For example, say we have example.md:
# Title
...
0
votes
3
answers
68
views
Get return value from awk script inside jenkins pipeline
My Jenkins pipeline gets exception. The script is below. It runs well if buildIdentifier=${echo $git_branch | awk -F"/" '{ print \$3}'} is removed.
git_branch has value remotes/origin/...
2
votes
2
answers
129
views
Remove a string containing a substring that will require a wildcard
First time I have had to post here to solve a problem. I am guessing I am missing something easy. Spent about four hours yesterday beating my head up against something I thought was going to be simple....
-3
votes
2
answers
47
views
How to filter out a row if it has a value in one column higher than the value on the same column but from another row [closed]
I want to filter this table, with this rule: if the value in column 2 corresponding to geneA is higher than gene B and gene C, the gene B and C are filtered out. If the value in column 2 corresponding ...
2
votes
1
answer
127
views
Awk in a bash script as a variable
I'm using a python program in a bash script. The python program directly spits out an output.csv file (can't change this). I wanted to pipe to an awk command and extract only one column, but that ...
0
votes
1
answer
84
views
Exclude lines with duplicate values in awk
I have a tsv file like this:
chr1 28932 29543 chr1 29159 29422 RNAPOLII_T1_pos_1_q05_peak_1 114 . 5.55679 14.5827 11.4511 119
chr1 199425 200055 . -1 -1 . . . . . ...