Python/Pandas- File extraction/parsing
Python/Pandas- File extraction/parsing
requirement
table format
Join condition
If first 5 characters of the starting of the line is ‘C4305’ – Write to a dataframe(df) by spliting the data into the columns A, B, C, D….
Column B is unique
If first 5 characters of the starting of the line is ‘C4306’ – Write to a dataframe(df2) into the columns A1, B1, C1, D1….
No match here so create new column which takes df[column B] values from before line(acts as a foreign key)
If first 5 characters of the starting of the line is ‘C4307’ – Write to a dataframe(df3) into the columns A2, B2, C2, D2….
No match here so create new column which takes df[column B] values from before line(acts as a foreign key)
They can be considered as parent child relationship
Each time the first 3 characters is ‘C42’ it is a new record
sample_output
df1
A
B(PrimaryKey)
C
D
E
F
G
H
C4305710235
5012
JDHGFG
10K
VALLEY
0
0
0
C4305234525
2046
JDHGFG
10K
VALLEY
0
0
0
C4305745576
7015
JDHGFG
10K
VALLEY
0
0
0
df2
A
B
C
D
E
F
G
Z(Foreign Key)
C4306710235
JDHGFG
10K
VALLEY
0
0
0
5012
C4306710235
JDHGFG
10K
VALLEY
0
0
0
5012
C4306710235
JDHGFG
10K
VALLEY
0
0
0
5012
C4306710235
JDHGFG
10K
VALLEY
0
0
0
5012
C4306234525
JDHGFG
10K
VALLEY
0
0
0
2046
C4306234525
JDHGFG
10K
VALLEY
0
0
0
2046
C4306745576
JDHGFG
10K
VALLEY
0
0
0
7015
C4306745576
JDHGFG
10K
VALLEY
0
0
0
7015
C4306745576
JDHGFG
10K
VALLEY
0
0
0
7015
df3
A
B
C
D
E
F
G
Z(Foreign Key)
C4307710235
JDHGFG
10K
VALLEY
0
0
0
5012
C4307234525
JDHGFG
10K
VALLEY
0
0
0
2046
"You need a similar assignment done from scratch? Our qualified writers will help you with a guaranteed AI-free & plagiarism-free A+ quality paper, Confidentiality, Timely delivery & Livechat/phone Support.
Discount Code: CIPD30
Click ORDER NOW..


