码创造者

这个屌丝很懒，什么也没留下！

热门标签

[Python] 读取一个xlsx文件处理并保存成txt文件或binary_python实现xlsx表转换txt

作者：码创造者 | 2024-07-13 01:22:30

踩

python实现xlsx表转换txt

最近拿到一个上游的xlsx, 需要自己加工处理取到自己想要的值，需要注意的知识点做个记录：

以下内容基于以下假设：
1, 文件名字为Data_A.xlsx和Data_B.xlsx, 其内容格式为：

A	B	D E	F
0x101	10	1	1024
0x201	11	1	117
0x301	12	3	31
0x401	13	2	41
0x501	14	1	126
0x601	15	4	13

0 主要程序框架

def clean():
        current_path = os.getcwd()
        for infile in glob.glob( os.path.join(current_path, '*.csv') ):
            os.remove(infile)

if __name__ == '__main__':
    clean()
    sys.exit()
1
2
3
4
5
6
7
8

1 从用户输入的cmd line读取xlsx文件

def parse_input(argv): 
    inputfile = '' 
    try: 
        opts, args = getopt.getopt(sys.argv[1:],"hi:",["help","ifile="])
    except getopt.GetoptError:
        sys.exit(2)
    for opt, arg in opts:
        if opt in ("-h","--help"):
            print('python3 parse_xlsx.py -i <inputfile>')
            sys.exit()
        elif opt in ("-i", "--ifile"): 
            inputfile = arg
    return inputfile
1
2
3
4
5
6
7
8
9
10
11
12
13

2 读取xlsx文件为`raw.csv`文件

def parse_xlsx(file):
    current_path = os.getcwd()
    filepath = current_path+'\\new.csv'
    if (os.path.exists(filepath)) :
        os.remove(filepath)

    rawdf = pd.read_excel(file, sheet_name=0)
    DataFrame(rawdf).to_csv('raw.csv', index=False, header=0)
1
2
3
4
5
6
7
8

3 读取`raw.csv`文件

newdf = pd.read_csv('raw.csv', sep=',', header=0, names=None)
1

4 将`D E`列进行处理：以此列的值作为排序依据先进行排序

newdf = pd.read_csv('raw.csv', sep=',', header=0, names=None)

newdf_temp = newdf
if any(newdf_temp['D E'] == 2):
	newdf_order_1 = newdf_temp.loc[newdf_temp['D E'] == 1]
	newdf_order_2 = newdf_temp.loc[newdf_temp['D E'] == 2]
	newdf = newdf_order_1._append(newdf_order_2, ignore_index=True)
	newdf_order_3 = newdf_temp.loc[newdf_temp['D E'] == 3]
	newdf = newdf._append(newdf_order_3, ignore_index=True)
	newdf_order_4 = newdf_temp.loc[newdf_temp['D E'] == 4]
	newdf = newdf._append(newdf_order_4, ignore_index=True)
1
2
3
4
5
6
7
8
9
10
11

5 将`A`列进行处理：去掉`0x`, 并按8位对齐

newdf['a'] = newdf['A'].astype('str') # the value of A column will handled as string
newdf['a'] = newdf['a'].str.replace(r"0x", "")
newdf_a = newdf['a'].str.zfill(8)
1
2
3

6 将`B`列进行处理：转成十六进制数, 并按8位对齐

newdf['b'] = newdf['B'].astype('int')
newdf['b'] = newdf['b'].apply( hex )
newdf['b'] = newdf['b'].str.replace(r"0x", "")
newdf_b = newdf['b'].str.zfill(8)
1
2
3
4

7 将`C`列进行处理：全部为NaN, 忽略

8 将`F`列进行处理：转成十六进制数, 并按8位对齐

	newdf['f'] = newdf['F'].astype('int')
	newdf['f'] = newdf['f'].apply( hex )
	newdf['f'] = newdf['f'].str.replace(r"0x", "")
	newdf_f = newdf['f'].str.zfill(8)
1
2
3
4

9 把上面处理完的`A`、`B`、`F`列进行拼接，结果存进`new.csv`

此处要注意拼接的前后顺序，注意是列拼接

newdf = newdf_a + newdf_b + newdf_f
DataFrame(newdf).to_csv('new.csv', index=False, header=False)
1
2

10 区分`A`和`B`两个文件

def distinguish(file):
	if "a".upper() in file.upper():
	    bfile = "A.bin"
	elif "b".upper() in file.upper():
	    bfile = "B.bin"
	else:
	    print("Please confirm input is a or b")
1
2
3
4
5
6
7

11 将传入的数据每8位作为一组，进行大小端转化

def byteorder_convert(para):
    little_hex = bytearray.fromhex(para)
    i =0
    tmp = bytearray()
    result = bytearray()
    while i < len(little_hex):
        tmp = little_hex[i:i+4]
        tmp.reverse()
        result = result + tmp
        # print(result)
        i += 4
    str_little = "".join(format(x, "02x") for x in result)
    return str_little
1
2
3
4
5
6
7
8
9
10
11
12
13

12.a 对每一行进行处理，结果存进`bfile`中

def binary(bfile):
    with open('new.csv', 'r') as f:
        reader = csv.reader(f)
        # print(type(reader))
        file = open(bfile, "wb")
        for row in reader:
            row = byteorder_convert(''.join(row))
            # print(row)
            result = ''
            t = '\\x'
            i = 0
            if len(row)%2 == 1:
                print('String length illegal')
            while i < len(row):
                result = result + t + row[i:i+2]
                i = i+2
                s = result
                b = eval('b' + '\'' + s + '\'')
            # print(result)
            # print(b)
            file.write(bytes(b))
        file.close()
        print("Done! Please check", bfile)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23

12.b 对每一行进行处理，结果存进`bfile`中

def array(file, newdf_a, newdf_b, newdf_f):
    if "a".upper() in file.upper():
        afile = "new_a.c"
    elif "b".upper() in file.upper():
        afile = "new_b.c"
    else:
        print("Please confirm input is a or b")

	newdf = "{"  + ".a = "+ newdf_a + ", .b = " + newdf_b + ", .f = " + newdf_f + "}"
	DataFrame(newdf).to_csv('new.csv', index=False, header=False)

    newdf = pd.read_csv('new.csv', sep=',', header=0, names=None)
    # print(newdf)
    DataFrame(newdf).to_csv(afile, sep=' ', index=False, header=False, quoting=csv.QUOTE_NONE, escapechar=' ')
    print("Done! Please check", afile)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15

13 删除过程中的`csv`中间文件

def clean():
        current_path = os.getcwd()
        for infile in glob.glob( os.path.join(current_path, '*.csv') ):
            os.remove(infile)
1
2
3
4

声明：本文内容由网友自发贡献，不代表【wpsshop博客】立场，版权归原作者所有，本站不承担相应法律责任。如您发现有侵权的内容，请联系我们。转载请注明出处：https://www.wpsshop.cn/w/码创造者/article/detail/817002

[Python] 读取一个xlsx文件处理并保存成txt文件或binary_python实现xlsx表转换txt

0 主要程序框架

1 从用户输入的cmd line读取xlsx文件

2 读取xlsx文件为raw.csv文件

3 读取raw.csv文件

4 将D E列进行处理：以此列的值作为排序依据先进行排序

5 将A列进行处理：去掉0x, 并按8位对齐

6 将B列进行处理：转成十六进制数, 并按8位对齐

7 将C列进行处理：全部为NaN, 忽略

8 将F列进行处理：转成十六进制数, 并按8位对齐

9 把上面处理完的A、B、F列进行拼接，结果存进new.csv

10 区分A和B两个文件