Linux mapfile Command: How to Read Standard Input or Large Files into an Array (with Examples and Caveats) - Linux入门自学网

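The mapfile command, also available under the name readarray, is a Bash builtin that reads lines from standard input or from a file descriptor directly into an indexed array. It is well suited to scripts that handle many lines of input, and it simplifies both creating the array and reading the data.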

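Introduction to the Linux mapfile Command

mapfile (synonym: readarray) reads lines from standard input, or from a file redirected to it, into an array variable. Options control how the data is read, including the line delimiter, the maximum number of lines, and the number of leading lines to skip.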

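Linux Versions That Support mapfile

mapfile is a builtin of the Bash shell and has been available since Bash 4.0. Most modern Linux distributions, such as Ubuntu, Fedora, CentOS, and Debian, ship a recent enough Bash, so the command works whenever the script is run by Bash. Note that on Debian and Ubuntu /bin/sh is dash, not Bash, so scripts that use mapfile need a #!/bin/bash shebang. On older distributions, or in environments where Bash is not installed, you may need to install or upgrade Bash.

On CentOS 7 and CentOS 8, Bash can be installed with the following commands: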

CentOS 7:

[linux@bashcommandnotfound.cn ~]$ sudo yum install bash

CentOS 8:

[linux@bashcommandnotfound.cn ~]$ sudo dnf install bash

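If the running Bash is older than 4.0, calling mapfile fails with the error bash: mapfile: command not found. Because mapfile is a builtin and not a separate program, the fix is to install or upgrade Bash as shown above and to make sure the script is actually run by Bash.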
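
Before relying on mapfile, a script can check whether the shell it runs in provides the builtin. The following is a minimal sketch; the variable name mapfile_kind is chosen here for illustration only.

```shell
#!/usr/bin/env bash
# Report the running Bash version and whether mapfile is available as a builtin.
echo "Bash version: ${BASH_VERSION}"

# type -t prints "builtin" for shell builtins and nothing for unknown names.
mapfile_kind=$(type -t mapfile)
if [[ $mapfile_kind == builtin ]]; then
    echo "mapfile is available"
else
    echo "mapfile is missing; Bash 4.0 or later is required" >&2
fi
```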

Basic Syntax of the Linux mapfile Command

mapfile [-d delim] [-n count] [-O origin] [-s count] [-t] [-u fd] [-C callback] [-c quantum] [array]

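Common Options and Parameters of the Linux mapfile Command

Option	Description
`-d`	Use the first character of delim as the line delimiter instead of newline (Bash 4.4 or later)
`-n`	Read at most count lines; 0 means all lines
`-O`	Start assigning at array index origin (default 0)
`-s`	Discard the first count lines read
`-t`	Remove the trailing delimiter (newline by default) from each line
`-u`	Read from file descriptor fd instead of standard input
`-C`	Evaluate callback each time quantum lines have been read
`-c`	Used with -C: the number of lines read between callback calls (default 5000)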


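Examples of the Linux mapfile Command

Example 1: Read a file into an array

[linux@bashcommandnotfound.cn ~]$ mapfile -t my_array < filename.txt

This command reads filename.txt into the array my_array, one line per element. The -t option strips the trailing newline from each element.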
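
The effect of -t is easiest to see by reading the same file twice. The sketch below is self-contained; demo_file is a hypothetical temporary file created only for this illustration.

```shell
#!/usr/bin/env bash
# Compare array elements read with and without -t.
demo_file=$(mktemp)
printf 'alpha\nbeta\ngamma\n' > "$demo_file"

mapfile raw_lines < "$demo_file"        # newline kept at the end of each element
mapfile -t clean_lines < "$demo_file"   # newline stripped from each element

echo "elements: ${#clean_lines[@]}"
echo "length of first element without -t: ${#raw_lines[0]}"
echo "length of first element with -t:    ${#clean_lines[0]}"

rm -f "$demo_file"
```

Without -t, every element still ends in a newline, which tends to cause surprises in string comparisons and when building file names.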

Example 2: Read a limited number of lines

[linux@bashcommandnotfound.cn ~]$ mapfile -n 10 -t my_array < filename.txt

This reads only the first 10 lines of filename.txt.

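Example 3: Read from a specific file descriptor

[linux@bashcommandnotfound.cn ~]$ exec 3< filename.txt
[linux@bashcommandnotfound.cn ~]$ mapfile -u 3 -t my_array
[linux@bashcommandnotfound.cn ~]$ exec 3<&-

This example opens filename.txt on file descriptor 3, reads it into the array with -u 3, and then closes the descriptor.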

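Example 4: Skip the first N lines of a file

[linux@bashcommandnotfound.cn ~]$ mapfile -s 2 -t my_array < filename.txt

If the first two lines of filename.txt are a header or other unwanted information, this command discards them and reads the remaining lines into the array.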

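Example 5: Use mapfile with a callback function

[linux@bashcommandnotfound.cn ~]$ mapfile -c 2 -C callback_function -t my_array < filename.txt

callback_function is evaluated each time two lines have been read. Bash appends two arguments to the call: the index of the next array element to be assigned and the line that will be stored there. The callback runs after the line is read but before the element is assigned.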
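
The command above assumes that callback_function already exists. A minimal sketch of such a function follows; the counter callback_calls and the list callback_indices are names introduced here for illustration, and the input is generated inline instead of coming from filename.txt.

```shell
#!/usr/bin/env bash
# mapfile calls callback_function with two arguments:
#   $1 = index of the next array element to be assigned
#   $2 = the line about to be stored at that index
callback_calls=0
callback_indices=()
callback_function() {
    callback_calls=$((callback_calls + 1))
    callback_indices+=("$1")
    echo "callback at index $1"
}

# Five lines of sample input with a quantum of 2:
# the callback fires after the 2nd and after the 4th line.
mapfile -c 2 -C callback_function -t my_array < <(printf '%s\n' one two three four five)

echo "lines read: ${#my_array[@]}, callback calls: $callback_calls"
```

Because the callback runs in the current shell, it can update variables such as a progress counter that remain visible after mapfile returns.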

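Example 6: Use a custom line delimiter

[linux@bashcommandnotfound.cn ~]$ mapfile -d ':' -t my_array < filename.txt

This reads filename.txt with the colon as the delimiter instead of newline, so each colon-terminated field becomes one array element. The -d option requires Bash 4.4 or later. Newlines are treated as ordinary data in this mode: if the file ends with a newline, that newline stays inside the last element.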
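
As a small self-contained illustration of -d, the sketch below splits an inline colon-separated string and then reads NUL-separated names. It assumes Bash 4.4 or later; the array names fields and names and the sample values are made up for the example.

```shell
#!/usr/bin/env bash
# Split colon-terminated records (the last record may be unterminated).
mapfile -d ':' -t fields < <(printf 'red:green:blue')
printf 'field: %s\n' "${fields[@]}"

# An empty delimiter means NUL. This pairs with NUL-separated output such as
# printf '%s\0' or find -print0 and is safe for names that contain newlines.
mapfile -d '' -t names < <(printf '%s\0' 'plain.txt' $'two\nlines.txt')
echo "names read: ${#names[@]}"
```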

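Example 7: Read another command's output into an array

[linux@bashcommandnotfound.cn ~]$ mapfile -t my_array < <(cat filename.txt)

Process substitution feeds the output of another command to mapfile on standard input. A plain pipeline such as cat filename.txt | mapfile -t my_array does not work as expected: by default Bash runs every stage of a pipeline in a subshell, so the array is filled there and is gone as soon as the pipeline ends.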
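
When the data comes from another command, it matters where mapfile runs. The following sketch compares a plain pipeline with two alternatives that keep the array in the current shell; the array names piped, substituted, and with_lastpipe are illustrative.

```shell
#!/usr/bin/env bash
# Pitfall: in a pipeline, mapfile runs in a subshell, so the parent shell
# never sees the array that was filled.
piped=()
printf 'a\nb\nc\n' | mapfile -t piped
echo "after pipeline: ${#piped[@]} elements"

# Alternative 1: process substitution keeps mapfile in the current shell.
mapfile -t substituted < <(printf 'a\nb\nc\n')
echo "after process substitution: ${#substituted[@]} elements"

# Alternative 2: lastpipe (Bash 4.2 or later) runs the last pipeline stage in
# the current shell. It only takes effect when job control is off, which is
# the default in non-interactive scripts.
shopt -s lastpipe
printf 'a\nb\nc\n' | mapfile -t with_lastpipe
echo "with lastpipe: ${#with_lastpipe[@]} elements"
```

Process substitution is usually the simpler choice, since it behaves the same in scripts and at an interactive prompt.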

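Example 8: Read data into an array starting at a given index

[linux@bashcommandnotfound.cn ~]$ mapfile -O 3 -t my_array < filename.txt

The data is stored starting at index 3, which is the fourth element because Bash arrays are zero-based. When -O is given, mapfile does not clear the array first, so any existing elements at indices 0 to 2 are kept.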
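
The interaction between -O and existing array contents can be checked with a short experiment. This is a sketch with made-up sample values, not part of the original example.

```shell
#!/usr/bin/env bash
# With -O, mapfile does not clear the array first, so earlier elements survive.
my_array=(header1 header2 header3)
mapfile -O 3 -t my_array < <(printf 'row1\nrow2\n')

# declare -p prints every index together with its value.
declare -p my_array
```

Without -O, the same mapfile call would have emptied my_array before storing row1 at index 0.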

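Example 9: Process the array in a loop

[linux@bashcommandnotfound.cn ~]$ mapfile -t my_array < filename.txt
[linux@bashcommandnotfound.cn ~]$ for line in "${my_array[@]}"; do printf '%s\n' "$line"; done

After the file is read into the array, the loop visits each element and prints it. Quoting "$line" keeps whitespace intact and prevents glob characters in the data from being expanded.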
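
When the position of each line matters, the loop can iterate over the array indices instead of the values. The sketch below uses inline sample data chosen to include leading spaces and a glob character.

```shell
#!/usr/bin/env bash
# Iterate over indices so that each line can be printed with its line number.
mapfile -t my_array < <(printf '%s\n' 'first line' '  indented' '*')

for i in "${!my_array[@]}"; do
    # Indices start at 0, so add 1 to get a human-friendly line number.
    printf '%d: %s\n' "$((i + 1))" "${my_array[i]}"
done
```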

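Example 10: Read data and limit the array size

[linux@bashcommandnotfound.cn ~]$ mapfile -n 5 -t my_array < filename.txt
[linux@bashcommandnotfound.cn ~]$ echo "${#my_array[@]}"

This reads only the first 5 lines of the file and prints the number of elements in the array.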

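Example 11: Process multi-line input with mapfile

[linux@bashcommandnotfound.cn ~]$ mapfile -t my_array < <(printf "line1\nline2\nline3")
[linux@bashcommandnotfound.cn ~]$ echo "${my_array[1]}"

This reads a multi-line string into the array and prints the second element, line2. Process substitution is used instead of a pipe so that my_array is filled in the current shell.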

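Example 12: Read data and apply a callback to every line

[linux@bashcommandnotfound.cn ~]$ show_line() { echo "Read line: $2"; }
[linux@bashcommandnotfound.cn ~]$ mapfile -c 1 -C show_line -t my_array < filename.txt

With -c 1 the callback runs for every line of the file. mapfile passes the array index as $1 and the line as $2, so show_line prints each line as it is read.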

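Example 13: Use mapfile with an associative array

mapfile fills only indexed arrays. To get file data into an associative array, read it into an indexed array first and convert it afterwards. The associative array has to be declared separately with declare -A.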
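
One way to do the conversion is sketched below. The key=value layout of the input and the names config_lines and config are assumptions made for this illustration; other file formats need a different split.

```shell
#!/usr/bin/env bash
# Convert "key=value" lines into an associative array (Bash 4.0 or later).
mapfile -t config_lines < <(printf '%s\n' 'host=example.org' 'port=8080' 'path=/a=b')

declare -A config
for entry in "${config_lines[@]}"; do
    [[ $entry == ?*=* ]] || continue   # skip lines without a key or separator
    key=${entry%%=*}                   # text before the first "="
    value=${entry#*=}                  # text after the first "="
    config[$key]=$value
done

echo "host=${config[host]} port=${config[port]} path=${config[path]}"
```

Splitting on the first "=" only means that values may themselves contain "=" characters.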

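Example 14: Filter out empty lines

[linux@bashcommandnotfound.cn ~]$ mapfile -t my_array < <(grep -v '^$' filename.txt)

grep removes the empty lines first, and the result is read into the array through process substitution.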

Example 15: Read only lines that match a pattern

[linux@bashcommandnotfound.cn ~]$ mapfile -t my_array < <(grep 'pattern' filename.txt)

Only the lines that contain pattern are read into the array.

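Tips and Advanced Techniques

Advanced technique 1: Use IFS alongside mapfile

IFS (Internal Field Separator) is the shell variable Bash uses for word splitting and for splitting a line into fields in the read builtin. By default it contains space, tab, and newline, and you can change it to alter that behavior. mapfile itself ignores IFS and splits only on its line delimiter, so splitting a single line into fields is a job for read -a.

For example, to fill an array from comma-separated values on one line:

IFS=',' read -r -a my_array <<< "value1,value2,value3"

In this example the string "value1,value2,value3" is split into three elements, which are stored in the array my_array. The IFS assignment applies to this read command only.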
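
The two tools combine naturally: mapfile splits the input into lines, and read -a splits each line into fields. The following sketch assumes simple comma-separated rows without quoted or embedded commas; the names rows, cols, and first_fields are illustrative.

```shell
#!/usr/bin/env bash
# mapfile produces one array element per line.
mapfile -t rows < <(printf '%s\n' 'alice,30,admin' 'bob,25,user')

first_fields=()
for row in "${rows[@]}"; do
    # The IFS assignment applies to this read command only.
    IFS=',' read -r -a cols <<< "$row"
    echo "name=${cols[0]} age=${cols[1]} role=${cols[2]}"
    first_fields+=("${cols[0]}")
done
```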

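Advanced technique 2: Process large files

If you are working with a very large file and want to handle only part of it, or need to process it in batches to save memory, use the -n and -s options.

For example, a file with millions of lines can be processed like this:

# Process the first 10000 lines
mapfile -n 10000 -t my_array < hugefile.txt

# Then process the next 10000 lines
mapfile -n 10000 -s 10000 -t my_array < hugefile.txt

By changing the value of -s you move forward through the file and process a different part in each batch. Each call replaces the contents of my_array, and because the file is reopened every time, the skipped lines are read and discarded again on every call.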
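
For many batches, an alternative is to keep the file open on one descriptor and call mapfile -n repeatedly, so that each call continues where the previous one stopped. The sketch below is a minimal illustration under that assumption; big_file stands in for hugefile.txt, and batch_size is set to 3 only to keep the demonstration short.

```shell
#!/usr/bin/env bash
# Build a small sample file with seven lines.
big_file=$(mktemp)
printf '%s\n' 1 2 3 4 5 6 7 > "$big_file"
batch_size=3

batch_sizes=()
exec 3< "$big_file"
# mapfile succeeds even at end of file, so the loop also checks
# that the batch actually contains lines.
while mapfile -n "$batch_size" -t -u 3 batch && (( ${#batch[@]} > 0 )); do
    echo "batch of ${#batch[@]} lines, first line: ${batch[0]}"
    batch_sizes+=("${#batch[@]}")
done
exec 3<&-
rm -f "$big_file"
```

Only one batch is held in memory at a time, and no line is read more than once.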

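Caveats for mapfile

mapfile is not suitable for binary files: it reads input as delimiter-terminated lines, and Bash variables cannot store NUL bytes.
When no array name is given, mapfile stores the data it reads in the default array MAPFILE.
mapfile does not interpret quotes or escape characters in the input; they are stored literally. If your file contains such characters and you want them processed, you need to handle that yourself before or after reading.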
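
The last two points can be checked with a short sketch. The sample strings are made up for illustration.

```shell
#!/usr/bin/env bash
# Without an array name, the lines land in the default array MAPFILE.
# Quotes and backslashes are stored literally, with no unescaping.
mapfile -t < <(printf '%s\n' '"quoted value"' 'back\slash' 'it'\''s')

echo "elements in MAPFILE: ${#MAPFILE[@]}"
printf '[%s]\n' "${MAPFILE[@]}"
```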
