Python/Pandas- File extraction/parsing

requirement

table format
Join condition

If the first 5 characters of a line are ‘C4305’, write the line to a dataframe (df), splitting the data into the columns A, B, C, D, ….
Column B is unique.

If the first 5 characters of a line are ‘C4306’, write the line to a dataframe (df2) with the columns A1, B1, C1, D1, ….
These lines carry no matching key of their own, so create a new column that takes the df[column B] value from the preceding ‘C4305’ line (it acts as a foreign key).

If the first 5 characters of a line are ‘C4307’, write the line to a dataframe (df3) with the columns A2, B2, C2, D2, ….
These lines carry no matching key of their own, so create a new column that takes the df[column B] value from the preceding ‘C4305’ line (it acts as a foreign key).

The dataframes can be treated as a parent-child relationship: df is the parent, and df2 and df3 are its children.

Each time the first 3 characters of a line are ‘C42’, a new record starts.
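The prefix rules above can be sketched in pandas roughly as follows. This is a minimal sketch under assumptions the requirement does not state: fields are taken to be comma-separated, every line of a given type has the same number of fields, and the child columns follow the sample output (A to G plus Z). The function name parse_records and the sample lines are hypothetical.

```python
import io
import pandas as pd

# Hypothetical input; the real file layout and delimiter are not given.
SAMPLE = """\
C4305710235,5012,JDHGFG,10K,VALLEY,0,0,0
C4306710235,JDHGFG,10K,VALLEY,0,0,0
C4306710235,JDHGFG,10K,VALLEY,0,0,0
C4307710235,JDHGFG,10K,VALLEY,0,0,0
C4305234525,2046,JDHGFG,10K,VALLEY,0,0,0
C4306234525,JDHGFG,10K,VALLEY,0,0,0
"""

def parse_records(lines, sep=","):
    """Route each line by its 5-character prefix and carry column B forward."""
    parents, children_06, children_07 = [], [], []
    current_key = None  # column B of the most recent C4305 line
    for raw in lines:
        line = raw.rstrip("\n")
        prefix = line[:5]
        fields = line.split(sep)
        if prefix == "C4305":
            current_key = fields[1]  # assumed: B is the second field
            parents.append(fields)
        elif prefix == "C4306":
            children_06.append(fields + [current_key])
        elif prefix == "C4307":
            children_07.append(fields + [current_key])
        # Lines with any other prefix are ignored in this sketch.
    child_cols = list("ABCDEFG") + ["Z"]
    df = pd.DataFrame(parents, columns=list("ABCDEFGH"))
    df2 = pd.DataFrame(children_06, columns=child_cols)
    df3 = pd.DataFrame(children_07, columns=child_cols)
    return df, df2, df3

df, df2, df3 = parse_records(io.StringIO(SAMPLE))
print(df2[["A", "Z"]])
```

Carrying the key in a single variable works only if every child line follows its parent line in the file, which is what "from the preceding line" suggests. All values are kept as strings here; converting numeric columns is left out.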

sample_output

df1

A            B (Primary Key)  C       D    E       F  G  H
C4305710235  5012             JDHGFG  10K  VALLEY  0  0  0
C4305234525  2046             JDHGFG  10K  VALLEY  0  0  0
C4305745576  7015             JDHGFG  10K  VALLEY  0  0  0

df2

A            B       C    D       E  F  G  Z (Foreign Key)
C4306710235  JDHGFG  10K  VALLEY  0  0  0  5012
C4306710235  JDHGFG  10K  VALLEY  0  0  0  5012
C4306710235  JDHGFG  10K  VALLEY  0  0  0  5012
C4306710235  JDHGFG  10K  VALLEY  0  0  0  5012
C4306234525  JDHGFG  10K  VALLEY  0  0  0  2046
C4306234525  JDHGFG  10K  VALLEY  0  0  0  2046
C4306745576  JDHGFG  10K  VALLEY  0  0  0  7015
C4306745576  JDHGFG  10K  VALLEY  0  0  0  7015
C4306745576  JDHGFG  10K  VALLEY  0  0  0  7015

df3

A            B       C    D       E  F  G  Z (Foreign Key)
C4307710235  JDHGFG  10K  VALLEY  0  0  0  5012
C4307234525  JDHGFG  10K  VALLEY  0  0  0  2046
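Once the three dataframes exist, the parent-child link can be checked by joining a child's Z column back to the parent's B column. The sketch below rebuilds only columns A, B and Z from the sample output; the variable names joined, orphans and counts are illustrative and not part of the requirement.

```python
import pandas as pd

# Parent rows (A and B only) taken from the df1 sample output.
df = pd.DataFrame({
    "A": ["C4305710235", "C4305234525", "C4305745576"],
    "B": ["5012", "2046", "7015"],
})

# Child rows (A and Z only) taken from the df2 sample output.
df2 = pd.DataFrame({
    "A": ["C4306710235"] * 4 + ["C4306234525"] * 2 + ["C4306745576"] * 3,
    "Z": ["5012"] * 4 + ["2046"] * 2 + ["7015"] * 3,
})

# Left join child -> parent; validate raises MergeError if B is not unique.
joined = df2.merge(
    df,
    left_on="Z",
    right_on="B",
    how="left",
    suffixes=("_child", "_parent"),
    validate="many_to_one",
    indicator=True,
)

# Child rows whose foreign key matches no parent row.
orphans = joined[joined["_merge"] == "left_only"]

# Number of child rows per parent key.
counts = df2.groupby("Z").size()

print(len(orphans))
print(counts)
```

A left join with indicator=True keeps every child row, so a missing parent shows up as a left_only row instead of silently disappearing, as it would with an inner join.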