Copy a specific percentage of each file in a directory to a new file

Wilbeibi

For example, we have N files (file1, file2, file3 ...).

We need the first 20% of each of them, and the result directory should look like (file1_20, file2_20, file3_20 ...).

I was thinking of using wc to get the number of lines in each file, then multiplying that by 0.2.

Then I would use head to get the first 20% and redirect it to a new file, but I don't know how to automate it.

jmunsch

Let's create a single example file to work from:

$ echo {0..100} > file1
$ cat file1
    0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100

We can grab the size of the file in bytes with stat:

$ stat --printf %s "file1"
294
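One caveat: --printf is an option of GNU stat, so the command above may not work on BSD or macOS. As a rough alternative, wc -c reads the same byte count with POSIX tools only. The function name file_size_bytes below is just an illustrative helper, not part of the original answer:

```shell
# Hypothetical helper: print the size of a file in bytes using only POSIX tools.
file_size_bytes() {
    # Reading from stdin keeps wc from printing the file name.
    # The arithmetic expansion strips the padding some wc versions add.
    echo $(( $(wc -c < "$1") ))
}
```

For the example file, file_size_bytes file1 should report the same 294 bytes that stat does.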

And then, using bc, we can multiply the size by .2:

$ echo "294*.2" | bc
58.8

However, we get a float, so let's convert it to an integer for head (dd might work here too):

$ printf %.0f "58.8"
59
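If bc is not available, the multiply-and-round steps can probably be folded into plain shell integer arithmetic. This is only a sketch: percent_of is a made-up helper name, and it always rounds exact halves up, which may differ slightly from what printf %.0f does on a value ending in .5.

```shell
# Hypothetical helper: PERCENT of SIZE, rounded to the nearest integer
# (exact halves round up), using only shell integer arithmetic.
percent_of() {
    # $1 = size, $2 = percent; adding 50 before dividing by 100 rounds to nearest
    echo $(( ($1 * $2 + 50) / 100 ))
}

percent_of 294 20   # prints 59
```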

And finally the first twenty percent (give or take a byte) of file1:

$ head -c "59" "file1"
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22

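Putting it together, we could then do something like this:

mkdir -p a_new_directory
for f in file*; do
    [ -f "$f" ] || continue                              # skip anything that is not a regular file
    file_size=$(stat --printf %s "$f")                   # size in bytes (GNU stat)
    percent_size_as_float=$(echo "$file_size*.2" | bc)   # 20% of the size, as a float
    float_to_int=$(printf %.0f "$percent_size_as_float") # round to an integer for head
    new_fn="${f}_20"                                     # new name, e.g. file1_20
    # Write head's output straight to the new file: capturing it in a variable
    # drops trailing newlines, and printf would treat % or \ in the data as format directives.
    head -c "$float_to_int" "$f" > "a_new_directory/$new_fn"
done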

Here f is a placeholder for each item matching file* in the directory where the for loop is run.

After the loop has finished:

$ cat a_new_directory/file1_20
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 

update (to grab 20% of lines):

To grab approximately the first 20% of lines, we could replace stat --printf %s "$f" with the following, and change head -c to head -n:

wc -l < "$f"

Because printf rounds the bc result to the nearest integer, a file that is only 1 or 2 lines long gives 0.2 or 0.4, which rounds down to 0, so nothing is copied. So we would want not only to round, but also to default to grabbing at least 1 line.
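A minimal sketch of that line-based variant, with the "at least 1 line" default, could look like the following. The function name copy_head_lines, its argument order, and the use of integer arithmetic in place of bc are my own assumptions for illustration:

```shell
# Sketch: copy roughly the first PERCENT of lines of each given file into DEST.
# copy_head_lines and its argument order are illustrative, not from the answer above.
copy_head_lines() {
    percent=$1
    dest=$2
    shift 2
    mkdir -p "$dest"
    for f in "$@"; do
        [ -f "$f" ] || continue                  # skip directories and unmatched globs
        out="$dest/$(basename "$f")_$percent"    # e.g. file1 -> file1_20
        if [ ! -s "$f" ]; then
            : > "$out"                           # empty input gives empty output
            continue
        fi
        total=$(( $(wc -l < "$f") ))             # wc -l counts newline characters
        count=$(( (total * percent + 50) / 100 ))  # round to the nearest line
        [ "$count" -ge 1 ] || count=1            # never take fewer than 1 line
        head -n "$count" "$f" > "$out"
    done
}

# Example call (not executed here): copy_head_lines 20 a_new_directory file*
```

With this, a 10-line file contributes its first 2 lines, while a 1 or 2 line file still contributes its first line.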
