Split text into multiple lines based on pipe and cap delimiter - Oracle PL/SQL Pipelined FunctionHow to concatenate text from multiple rows into a single text string in SQL server?Split Strings into words with multiple word boundary delimitersOracle SQL, concatenate multiple columns + add textSplit string with multiple delimiters in Pythonsplitting a string based on multiple char delimitersSplit one file into multiple files based on delimiterSplitting string into multiple rows in OracleSplit String by delimiter position using oracle SQLOracle 11gR2: split string with multiple delimiters(add)Oracle SQL splitting single rows into multiples by delimiter

How to model the curly cable part of the phone

On which topic did Indiana Jones write his doctoral thesis?

How can I close a gap between my fence and my neighbor's that's on his side of the property line?

Why was the battle set up *outside* Winterfell?

How can I get a job without pushing my family's income into a higher tax bracket?

Can a nothic's Weird Insight action discover secrets about a player character that the character doesn't know about themselves?

I'm in your subnets, golfing your code

Are there any Final Fantasy Spirits in Super Smash Bros Ultimate?

Fill points into a pre-rotated convex Dodecahedron

What is the most remote airport from the center of the city it supposedly serves?

How do LIGO and VIRGO know that a gravitational wave has its origin in a neutron star or a black hole?

As matter approaches a black hole, does it speed up?

Does a card have a keyword if it has the same effect as said keyword?

Manager is threatening to grade me poorly if I don't complete the project

If your medical expenses exceed your income does the IRS pay you?

How long would it take for people to notice a mass disappearance?

Expressing 'our' for objects belonging to our apartment

Building a list of products from the elements in another list

Why do money exchangers give different rates to different bills?

How does this change to the opportunity attack rule impact combat?

Why is B♯ higher than C♭ in 31-ET?

What are the differences between credential stuffing and password spraying?

Timing of New Changes in Change Sets

Have I damaged my car by attempting to reverse with hand/park brake up?

Split text into multiple lines based on pipe and cap delimiter - Oracle PL/SQL Pipelined Function

How to concatenate text from multiple rows into a single text string in SQL server?Split Strings into words with multiple word boundary delimitersOracle SQL, concatenate multiple columns + add textSplit string with multiple delimiters in Pythonsplitting a string based on multiple char delimitersSplit one file into multiple files based on delimiterSplitting string into multiple rows in OracleSplit String by delimiter position using oracle SQLOracle 11gR2: split string with multiple delimiters(add)Oracle SQL splitting single rows into multiples by delimiter

.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty,.everyoneloves__bot-mid-leaderboard:empty height:90px;width:728px;box-sizing:border-box;

I have a table:

 CREATE TABLE "text_file"
( "SEQ" NUMBER,
"SPLIT_VALUE" CLOB
)

The content of the table is:

SEQ SPLIT_VALUE
1 MSH|^~&|GHH LAB|ELAB-3|GHH OE|BLDG4|200202150930||ORU^R01
 PID|||555-44-4444||EVERYWOMAN^EVE^E^^^^L|JONES|19620320|F|||153 FERNWOOD DR.^^STATESVILLE^OH^35292|
 OBR|1|845439^GHH OE|1045813^GHH LAB|15545^GLUCOSE|||200202150730
 OBX|1|SN|1554-5^GLUCOSE^POST 12H CFST:MCNC:PT:SER/PLAS:QN||^182|mg/dl|70_105
 OBX|2|SN|1554-5^GLUCOSE^POST 12H CFST:MCNC:PT:SER/PLAS:QN||^172|mg/dl|70_105

2 MSH|^~&|GHH LAB|ELAB-3|GHH OE|BLDG4|200202150930||ORU^R01
 PID|||555-44-4444||EVERYWOMAN^EVE^E^^^^L|JONES|19620320|F|||153 FERNWOOD DR.^^STATESVILLE^OH^35292|
 OBR|1|845439^GHH OE|1045813^GHH LAB|15545^GLUCOSE|||200202150730
 OBX|1|SN|1554-5^GLUCOSE^POST 12H CFST:MCNC:PT:SER/PLAS:QN||^182|mg/dl|70_105
 OBX|2|SN|1554-5^GLUCOSE^POST 12H CFST:MCNC:PT:SER/PLAS:QN||^172|mg/dl|70_105

Please note - the possible segment like MSH, OBR, OBX, LX can be 3 character or 2 characters. So, best way would be to get the segment name before the first pipe.

I am looking to split the string in split_value into multiple rows in the following conditions:

SEQ -- it would pick from the first column

SPLIT_SEQ -- it would split based on the first word before |, for ex. MSH, OBR, OBX, LX followed by sequence starting from 00. If there is a cap ^, then it would break down even further, for ex. MSH08-01, MSH08-02

Please note - there is an exception for segment MSH. For MSH - first
element is | and second one is ^~&

SEQ SPLIT_SEQ SEG_SEQ SPLIT_SEQ_VALUE
1 MSH00 1 MSH
1 MSH01 1 |
1 MSH02 1 ^~&
1 MSH03 1 GHH LAB
1 MSH04 1 ELAB-3

SEG_SEQ -- if the segment, the first word before | is repeated in the same SEQ, then increase it. So, if OBX is twice, then first OBX values would be 1 and for second OBX, it would be 2 and so on

SPLIT_SEQ_VALUE -- The value from the message above would be specified here.

Please note - I have around 90,000 rows in text_file table. So the solution should be able to process 90,000 efficiently.

The complete output is:

SEQ SPLIT_SEQ SEG_SEQ SPLIT_SEQ_VALUE
1 MSH00 1 MSH
1 MSH01 1 |
1 MSH02 1 ^~&
1 MSH03 1 GHH LAB
1 MSH04 1 ELAB-3
1 MSH05 1 GHH OE
1 MSH06 1 BLDG4
1 MSH07 1 200202150930
1 MSH08 1 
1 MSH09-01 1 ORU
1 MSH09-02 1 R01
1 PID00 1 PID
1 PID01 1 
1 PID02 1 
1 PID03 1 555-44-4444
1 PID04 1 
1 PID05-01 1 EVERYWOMAN
1 PID05-02 1 EVE
1 PID05-03 1 E
1 PID05-04 1 
1 PID05-05 1 
1 PID05-06 1 
1 PID05-07 1 L
1 PID06 1 JONES
1 PID07 1 19620320
1 PID08 1 F
1 PID09 1 
1 PID10 1 
1 PID11-01 1 153 FERNWOOD DR.
1 PID11-02 1 
1 PID11-03 1 STATESVILLE
1 PID11-04 1 OH
1 PID11-05 1 35292
1 PID12 1 
1 OBR00 1 OBR
1 OBR01 1 1
1 OBR02-01 1 845439
1 OBR02-02 1 GHH OE
1 OBR03-01 1 1045813
1 OBR03-02 1 GHH LAB
1 OBR04-01 1 15545
1 OBR04-02 1 GLUCOSE
1 OBR05 1 
1 OBR06 1 
1 OBR07 1 200202150730
1 OBX00 1 OBX
1 OBX01 1 1
1 OBX02 1 SN
1 OBX03-01 1 1554-5
1 OBX03-02 1 GLUCOSE
1 OBX03-03 1 POST 12H CFST:MCNC:PT:SER/PLAS:QN
1 OBX04 1 
1 OBX05-01 1 
1 OBX05-02 1 182
1 OBX06 1 mg/dl
1 OBX07 1 70_105
1 OBX00 2 OBX
1 OBX01 2 1
1 OBX02 2 SN
1 OBX03-01 2 1554-5
1 OBX03-02 2 GLUCOSE
1 OBX03-03 2 POST 12H CFST:MCNC:PT:SER/PLAS:QN
1 OBX04 2 
1 OBX05-01 2 
1 OBX05-02 2 182
1 OBX06 2 mg/dl
1 OBX07 2 70_105

2 MSH00 1 MSH
2 MSH01 1 |
2 MSH02 1 ^~&
2 MSH03 1 GHH LAB
2 MSH04 1 ELAB-3
2 MSH05 1 GHH OE
2 MSH06 1 BLDG4
2 MSH07 1 200202150930
2 MSH08 1 
2 MSH09-01 1 ORU
2 MSH09-02 1 R01
2 PID00 1 PID
2 PID01 1 
2 PID02 1 
2 PID03 1 555-44-4444
2 PID04 1 
2 PID05-01 1 EVERYWOMAN
2 PID05-02 1 EVE
2 PID05-03 1 E
2 PID05-04 1 
2 PID05-05 1 
2 PID05-06 1 
2 PID05-07 1 L
2 PID06 1 JONES
2 PID07 1 19620320
2 PID08 1 F
2 PID09 1 
2 PID10 1 
2 PID11-01 1 153 FERNWOOD DR.
2 PID11-02 1 
2 PID11-03 1 STATESVILLE
2 PID11-04 1 OH
2 PID11-05 1 35292
2 PID12 1 
2 OBR00 1 OBR
2 OBR01 1 1
2 OBR02-01 1 845439
2 OBR02-02 1 GHH OE
2 OBR03-01 1 1045813
2 OBR03-02 1 GHH LAB
2 OBR04-01 1 15545
2 OBR04-02 1 GLUCOSE
2 OBR05 1 
2 OBR06 1 
2 OBR07 1 200202150730
2 OBX00 1 OBX
2 OBX01 1 1
2 OBX02 1 SN
2 OBX03-01 1 1554-5
2 OBX03-02 1 GLUCOSE
2 OBX03-03 1 POST 12H CFST:MCNC:PT:SER/PLAS:QN
2 OBX04 1 
2 OBX05-01 1 
2 OBX05-02 1 182
2 OBX06 1 mg/dl
2 OBX07 1 70_105
2 OBX00 2 OBX
2 OBX01 2 1
2 OBX02 2 SN
2 OBX03-01 2 1554-5
2 OBX03-02 2 GLUCOSE
2 OBX03-03 2 POST 12H CFST:MCNC:PT:SER/PLAS:QN
2 OBX04 2 
2 OBX05-01 2 
2 OBX05-02 2 182
2 OBX06 2 mg/dl
2 OBX07 2 70_105

I believe that in as plsql pipelined function would be the best way.

Any help would be appreciated.

edited Mar 18 at 17:11

asked Mar 15 at 8:24

dang

63212145

2

Didn't you ask this question yesterday?

– jarlh
Mar 15 at 8:28

That was a different question. I have to do a lot of string manipulation in Oracle and I have no idea how to do it. So, I am trying to learn using different techniques.

– dang
Mar 15 at 8:44

what do you mean you are not sure if the statement would work. did you try out it or not? if you try to use search, you will already find a lot of questions and answers how to split a string by delimeter.

– hotfix
Mar 15 at 9:22

5

This is an HL7 format file commonly used to transfer healthcare data. It has a defined layout that you need to be aware of when consuming the data within. It can have sub sections that repeat, etc. i.e. the second pipe-delimited element after the MSH which looks like a jumble of characters (^~&) actually defines the delimiters usind for the sub-elements, etc. Make sure you are carefully following the HL7 spec when extracting data!

– Gary_W
Mar 18 at 18:41

1

@dang Do a little searching for HL7 parser, this has been around for ages and is typically done in an ETL tool. Check this out: hl7messageparser.azurewebsites.net/Parse/ParseView

– Gary_W
Mar 20 at 19:06

|
show 3 more comments

I have a table:

 CREATE TABLE "text_file"
( "SEQ" NUMBER,
"SPLIT_VALUE" CLOB
)

The content of the table is:

SEQ SPLIT_VALUE
1 MSH|^~&|GHH LAB|ELAB-3|GHH OE|BLDG4|200202150930||ORU^R01
 PID|||555-44-4444||EVERYWOMAN^EVE^E^^^^L|JONES|19620320|F|||153 FERNWOOD DR.^^STATESVILLE^OH^35292|
 OBR|1|845439^GHH OE|1045813^GHH LAB|15545^GLUCOSE|||200202150730
 OBX|1|SN|1554-5^GLUCOSE^POST 12H CFST:MCNC:PT:SER/PLAS:QN||^182|mg/dl|70_105
 OBX|2|SN|1554-5^GLUCOSE^POST 12H CFST:MCNC:PT:SER/PLAS:QN||^172|mg/dl|70_105

2 MSH|^~&|GHH LAB|ELAB-3|GHH OE|BLDG4|200202150930||ORU^R01
 PID|||555-44-4444||EVERYWOMAN^EVE^E^^^^L|JONES|19620320|F|||153 FERNWOOD DR.^^STATESVILLE^OH^35292|
 OBR|1|845439^GHH OE|1045813^GHH LAB|15545^GLUCOSE|||200202150730
 OBX|1|SN|1554-5^GLUCOSE^POST 12H CFST:MCNC:PT:SER/PLAS:QN||^182|mg/dl|70_105
 OBX|2|SN|1554-5^GLUCOSE^POST 12H CFST:MCNC:PT:SER/PLAS:QN||^172|mg/dl|70_105

Please note - the possible segment like MSH, OBR, OBX, LX can be 3 character or 2 characters. So, best way would be to get the segment name before the first pipe.

I am looking to split the string in split_value into multiple rows in the following conditions:

SEQ -- it would pick from the first column

SPLIT_SEQ -- it would split based on the first word before |, for ex. MSH, OBR, OBX, LX followed by sequence starting from 00. If there is a cap ^, then it would break down even further, for ex. MSH08-01, MSH08-02

Please note - there is an exception for segment MSH. For MSH - first
element is | and second one is ^~&

SEQ SPLIT_SEQ SEG_SEQ SPLIT_SEQ_VALUE
1 MSH00 1 MSH
1 MSH01 1 |
1 MSH02 1 ^~&
1 MSH03 1 GHH LAB
1 MSH04 1 ELAB-3

SEG_SEQ -- if the segment, the first word before | is repeated in the same SEQ, then increase it. So, if OBX is twice, then first OBX values would be 1 and for second OBX, it would be 2 and so on

SPLIT_SEQ_VALUE -- The value from the message above would be specified here.

Please note - I have around 90,000 rows in text_file table. So the solution should be able to process 90,000 efficiently.

The complete output is:

SEQ SPLIT_SEQ SEG_SEQ SPLIT_SEQ_VALUE
1 MSH00 1 MSH
1 MSH01 1 |
1 MSH02 1 ^~&
1 MSH03 1 GHH LAB
1 MSH04 1 ELAB-3
1 MSH05 1 GHH OE
1 MSH06 1 BLDG4
1 MSH07 1 200202150930
1 MSH08 1 
1 MSH09-01 1 ORU
1 MSH09-02 1 R01
1 PID00 1 PID
1 PID01 1 
1 PID02 1 
1 PID03 1 555-44-4444
1 PID04 1 
1 PID05-01 1 EVERYWOMAN
1 PID05-02 1 EVE
1 PID05-03 1 E
1 PID05-04 1 
1 PID05-05 1 
1 PID05-06 1 
1 PID05-07 1 L
1 PID06 1 JONES
1 PID07 1 19620320
1 PID08 1 F
1 PID09 1 
1 PID10 1 
1 PID11-01 1 153 FERNWOOD DR.
1 PID11-02 1 
1 PID11-03 1 STATESVILLE
1 PID11-04 1 OH
1 PID11-05 1 35292
1 PID12 1 
1 OBR00 1 OBR
1 OBR01 1 1
1 OBR02-01 1 845439
1 OBR02-02 1 GHH OE
1 OBR03-01 1 1045813
1 OBR03-02 1 GHH LAB
1 OBR04-01 1 15545
1 OBR04-02 1 GLUCOSE
1 OBR05 1 
1 OBR06 1 
1 OBR07 1 200202150730
1 OBX00 1 OBX
1 OBX01 1 1
1 OBX02 1 SN
1 OBX03-01 1 1554-5
1 OBX03-02 1 GLUCOSE
1 OBX03-03 1 POST 12H CFST:MCNC:PT:SER/PLAS:QN
1 OBX04 1 
1 OBX05-01 1 
1 OBX05-02 1 182
1 OBX06 1 mg/dl
1 OBX07 1 70_105
1 OBX00 2 OBX
1 OBX01 2 1
1 OBX02 2 SN
1 OBX03-01 2 1554-5
1 OBX03-02 2 GLUCOSE
1 OBX03-03 2 POST 12H CFST:MCNC:PT:SER/PLAS:QN
1 OBX04 2 
1 OBX05-01 2 
1 OBX05-02 2 182
1 OBX06 2 mg/dl
1 OBX07 2 70_105

2 MSH00 1 MSH
2 MSH01 1 |
2 MSH02 1 ^~&
2 MSH03 1 GHH LAB
2 MSH04 1 ELAB-3
2 MSH05 1 GHH OE
2 MSH06 1 BLDG4
2 MSH07 1 200202150930
2 MSH08 1 
2 MSH09-01 1 ORU
2 MSH09-02 1 R01
2 PID00 1 PID
2 PID01 1 
2 PID02 1 
2 PID03 1 555-44-4444
2 PID04 1 
2 PID05-01 1 EVERYWOMAN
2 PID05-02 1 EVE
2 PID05-03 1 E
2 PID05-04 1 
2 PID05-05 1 
2 PID05-06 1 
2 PID05-07 1 L
2 PID06 1 JONES
2 PID07 1 19620320
2 PID08 1 F
2 PID09 1 
2 PID10 1 
2 PID11-01 1 153 FERNWOOD DR.
2 PID11-02 1 
2 PID11-03 1 STATESVILLE
2 PID11-04 1 OH
2 PID11-05 1 35292
2 PID12 1 
2 OBR00 1 OBR
2 OBR01 1 1
2 OBR02-01 1 845439
2 OBR02-02 1 GHH OE
2 OBR03-01 1 1045813
2 OBR03-02 1 GHH LAB
2 OBR04-01 1 15545
2 OBR04-02 1 GLUCOSE
2 OBR05 1 
2 OBR06 1 
2 OBR07 1 200202150730
2 OBX00 1 OBX
2 OBX01 1 1
2 OBX02 1 SN
2 OBX03-01 1 1554-5
2 OBX03-02 1 GLUCOSE
2 OBX03-03 1 POST 12H CFST:MCNC:PT:SER/PLAS:QN
2 OBX04 1 
2 OBX05-01 1 
2 OBX05-02 1 182
2 OBX06 1 mg/dl
2 OBX07 1 70_105
2 OBX00 2 OBX
2 OBX01 2 1
2 OBX02 2 SN
2 OBX03-01 2 1554-5
2 OBX03-02 2 GLUCOSE
2 OBX03-03 2 POST 12H CFST:MCNC:PT:SER/PLAS:QN
2 OBX04 2 
2 OBX05-01 2 
2 OBX05-02 2 182
2 OBX06 2 mg/dl
2 OBX07 2 70_105

I believe that in as plsql pipelined function would be the best way.

Any help would be appreciated.

edited Mar 18 at 17:11

asked Mar 15 at 8:24

dang

63212145

2

Didn't you ask this question yesterday?

– jarlh
Mar 15 at 8:28

That was a different question. I have to do a lot of string manipulation in Oracle and I have no idea how to do it. So, I am trying to learn using different techniques.

– dang
Mar 15 at 8:44

what do you mean you are not sure if the statement would work. did you try out it or not? if you try to use search, you will already find a lot of questions and answers how to split a string by delimeter.

– hotfix
Mar 15 at 9:22

5

This is an HL7 format file commonly used to transfer healthcare data. It has a defined layout that you need to be aware of when consuming the data within. It can have sub sections that repeat, etc. i.e. the second pipe-delimited element after the MSH which looks like a jumble of characters (^~&) actually defines the delimiters usind for the sub-elements, etc. Make sure you are carefully following the HL7 spec when extracting data!

– Gary_W
Mar 18 at 18:41

1

@dang Do a little searching for HL7 parser, this has been around for ages and is typically done in an ETL tool. Check this out: hl7messageparser.azurewebsites.net/Parse/ParseView

– Gary_W
Mar 20 at 19:06

|
show 3 more comments

I have a table:

 CREATE TABLE "text_file"
( "SEQ" NUMBER,
"SPLIT_VALUE" CLOB
)

The content of the table is:

SEQ SPLIT_VALUE
1 MSH|^~&|GHH LAB|ELAB-3|GHH OE|BLDG4|200202150930||ORU^R01
 PID|||555-44-4444||EVERYWOMAN^EVE^E^^^^L|JONES|19620320|F|||153 FERNWOOD DR.^^STATESVILLE^OH^35292|
 OBR|1|845439^GHH OE|1045813^GHH LAB|15545^GLUCOSE|||200202150730
 OBX|1|SN|1554-5^GLUCOSE^POST 12H CFST:MCNC:PT:SER/PLAS:QN||^182|mg/dl|70_105
 OBX|2|SN|1554-5^GLUCOSE^POST 12H CFST:MCNC:PT:SER/PLAS:QN||^172|mg/dl|70_105

2 MSH|^~&|GHH LAB|ELAB-3|GHH OE|BLDG4|200202150930||ORU^R01
 PID|||555-44-4444||EVERYWOMAN^EVE^E^^^^L|JONES|19620320|F|||153 FERNWOOD DR.^^STATESVILLE^OH^35292|
 OBR|1|845439^GHH OE|1045813^GHH LAB|15545^GLUCOSE|||200202150730
 OBX|1|SN|1554-5^GLUCOSE^POST 12H CFST:MCNC:PT:SER/PLAS:QN||^182|mg/dl|70_105
 OBX|2|SN|1554-5^GLUCOSE^POST 12H CFST:MCNC:PT:SER/PLAS:QN||^172|mg/dl|70_105

Please note - the possible segment like MSH, OBR, OBX, LX can be 3 character or 2 characters. So, best way would be to get the segment name before the first pipe.

I am looking to split the string in split_value into multiple rows in the following conditions:

SEQ -- it would pick from the first column

SPLIT_SEQ -- it would split based on the first word before |, for ex. MSH, OBR, OBX, LX followed by sequence starting from 00. If there is a cap ^, then it would break down even further, for ex. MSH08-01, MSH08-02

Please note - there is an exception for segment MSH. For MSH - first
element is | and second one is ^~&

SEQ SPLIT_SEQ SEG_SEQ SPLIT_SEQ_VALUE
1 MSH00 1 MSH
1 MSH01 1 |
1 MSH02 1 ^~&
1 MSH03 1 GHH LAB
1 MSH04 1 ELAB-3

SEG_SEQ -- if the segment, the first word before | is repeated in the same SEQ, then increase it. So, if OBX is twice, then first OBX values would be 1 and for second OBX, it would be 2 and so on

SPLIT_SEQ_VALUE -- The value from the message above would be specified here.

Please note - I have around 90,000 rows in text_file table. So the solution should be able to process 90,000 efficiently.

The complete output is:

SEQ SPLIT_SEQ SEG_SEQ SPLIT_SEQ_VALUE
1 MSH00 1 MSH
1 MSH01 1 |
1 MSH02 1 ^~&
1 MSH03 1 GHH LAB
1 MSH04 1 ELAB-3
1 MSH05 1 GHH OE
1 MSH06 1 BLDG4
1 MSH07 1 200202150930
1 MSH08 1 
1 MSH09-01 1 ORU
1 MSH09-02 1 R01
1 PID00 1 PID
1 PID01 1 
1 PID02 1 
1 PID03 1 555-44-4444
1 PID04 1 
1 PID05-01 1 EVERYWOMAN
1 PID05-02 1 EVE
1 PID05-03 1 E
1 PID05-04 1 
1 PID05-05 1 
1 PID05-06 1 
1 PID05-07 1 L
1 PID06 1 JONES
1 PID07 1 19620320
1 PID08 1 F
1 PID09 1 
1 PID10 1 
1 PID11-01 1 153 FERNWOOD DR.
1 PID11-02 1 
1 PID11-03 1 STATESVILLE
1 PID11-04 1 OH
1 PID11-05 1 35292
1 PID12 1 
1 OBR00 1 OBR
1 OBR01 1 1
1 OBR02-01 1 845439
1 OBR02-02 1 GHH OE
1 OBR03-01 1 1045813
1 OBR03-02 1 GHH LAB
1 OBR04-01 1 15545
1 OBR04-02 1 GLUCOSE
1 OBR05 1 
1 OBR06 1 
1 OBR07 1 200202150730
1 OBX00 1 OBX
1 OBX01 1 1
1 OBX02 1 SN
1 OBX03-01 1 1554-5
1 OBX03-02 1 GLUCOSE
1 OBX03-03 1 POST 12H CFST:MCNC:PT:SER/PLAS:QN
1 OBX04 1 
1 OBX05-01 1 
1 OBX05-02 1 182
1 OBX06 1 mg/dl
1 OBX07 1 70_105
1 OBX00 2 OBX
1 OBX01 2 1
1 OBX02 2 SN
1 OBX03-01 2 1554-5
1 OBX03-02 2 GLUCOSE
1 OBX03-03 2 POST 12H CFST:MCNC:PT:SER/PLAS:QN
1 OBX04 2 
1 OBX05-01 2 
1 OBX05-02 2 182
1 OBX06 2 mg/dl
1 OBX07 2 70_105

2 MSH00 1 MSH
2 MSH01 1 |
2 MSH02 1 ^~&
2 MSH03 1 GHH LAB
2 MSH04 1 ELAB-3
2 MSH05 1 GHH OE
2 MSH06 1 BLDG4
2 MSH07 1 200202150930
2 MSH08 1 
2 MSH09-01 1 ORU
2 MSH09-02 1 R01
2 PID00 1 PID
2 PID01 1 
2 PID02 1 
2 PID03 1 555-44-4444
2 PID04 1 
2 PID05-01 1 EVERYWOMAN
2 PID05-02 1 EVE
2 PID05-03 1 E
2 PID05-04 1 
2 PID05-05 1 
2 PID05-06 1 
2 PID05-07 1 L
2 PID06 1 JONES
2 PID07 1 19620320
2 PID08 1 F
2 PID09 1 
2 PID10 1 
2 PID11-01 1 153 FERNWOOD DR.
2 PID11-02 1 
2 PID11-03 1 STATESVILLE
2 PID11-04 1 OH
2 PID11-05 1 35292
2 PID12 1 
2 OBR00 1 OBR
2 OBR01 1 1
2 OBR02-01 1 845439
2 OBR02-02 1 GHH OE
2 OBR03-01 1 1045813
2 OBR03-02 1 GHH LAB
2 OBR04-01 1 15545
2 OBR04-02 1 GLUCOSE
2 OBR05 1 
2 OBR06 1 
2 OBR07 1 200202150730
2 OBX00 1 OBX
2 OBX01 1 1
2 OBX02 1 SN
2 OBX03-01 1 1554-5
2 OBX03-02 1 GLUCOSE
2 OBX03-03 1 POST 12H CFST:MCNC:PT:SER/PLAS:QN
2 OBX04 1 
2 OBX05-01 1 
2 OBX05-02 1 182
2 OBX06 1 mg/dl
2 OBX07 1 70_105
2 OBX00 2 OBX
2 OBX01 2 1
2 OBX02 2 SN
2 OBX03-01 2 1554-5
2 OBX03-02 2 GLUCOSE
2 OBX03-03 2 POST 12H CFST:MCNC:PT:SER/PLAS:QN
2 OBX04 2 
2 OBX05-01 2 
2 OBX05-02 2 182
2 OBX06 2 mg/dl
2 OBX07 2 70_105

I believe that in as plsql pipelined function would be the best way.

Any help would be appreciated.

edited Mar 18 at 17:11

asked Mar 15 at 8:24

dang

63212145

I have a table:

 CREATE TABLE "text_file"
( "SEQ" NUMBER,
"SPLIT_VALUE" CLOB
)

The content of the table is:

SEQ SPLIT_VALUE
1 MSH|^~&|GHH LAB|ELAB-3|GHH OE|BLDG4|200202150930||ORU^R01
 PID|||555-44-4444||EVERYWOMAN^EVE^E^^^^L|JONES|19620320|F|||153 FERNWOOD DR.^^STATESVILLE^OH^35292|
 OBR|1|845439^GHH OE|1045813^GHH LAB|15545^GLUCOSE|||200202150730
 OBX|1|SN|1554-5^GLUCOSE^POST 12H CFST:MCNC:PT:SER/PLAS:QN||^182|mg/dl|70_105
 OBX|2|SN|1554-5^GLUCOSE^POST 12H CFST:MCNC:PT:SER/PLAS:QN||^172|mg/dl|70_105

2 MSH|^~&|GHH LAB|ELAB-3|GHH OE|BLDG4|200202150930||ORU^R01
 PID|||555-44-4444||EVERYWOMAN^EVE^E^^^^L|JONES|19620320|F|||153 FERNWOOD DR.^^STATESVILLE^OH^35292|
 OBR|1|845439^GHH OE|1045813^GHH LAB|15545^GLUCOSE|||200202150730
 OBX|1|SN|1554-5^GLUCOSE^POST 12H CFST:MCNC:PT:SER/PLAS:QN||^182|mg/dl|70_105
 OBX|2|SN|1554-5^GLUCOSE^POST 12H CFST:MCNC:PT:SER/PLAS:QN||^172|mg/dl|70_105

Please note - the possible segment like MSH, OBR, OBX, LX can be 3 character or 2 characters. So, best way would be to get the segment name before the first pipe.

I am looking to split the string in split_value into multiple rows in the following conditions:

SEQ -- it would pick from the first column

SPLIT_SEQ -- it would split based on the first word before |, for ex. MSH, OBR, OBX, LX followed by sequence starting from 00. If there is a cap ^, then it would break down even further, for ex. MSH08-01, MSH08-02

Please note - there is an exception for segment MSH. For MSH - first
element is | and second one is ^~&

SEQ SPLIT_SEQ SEG_SEQ SPLIT_SEQ_VALUE
1 MSH00 1 MSH
1 MSH01 1 |
1 MSH02 1 ^~&
1 MSH03 1 GHH LAB
1 MSH04 1 ELAB-3

SEG_SEQ -- if the segment, the first word before | is repeated in the same SEQ, then increase it. So, if OBX is twice, then first OBX values would be 1 and for second OBX, it would be 2 and so on

SPLIT_SEQ_VALUE -- The value from the message above would be specified here.

Please note - I have around 90,000 rows in text_file table. So the solution should be able to process 90,000 efficiently.

The complete output is:

SEQ SPLIT_SEQ SEG_SEQ SPLIT_SEQ_VALUE
1 MSH00 1 MSH
1 MSH01 1 |
1 MSH02 1 ^~&
1 MSH03 1 GHH LAB
1 MSH04 1 ELAB-3
1 MSH05 1 GHH OE
1 MSH06 1 BLDG4
1 MSH07 1 200202150930
1 MSH08 1 
1 MSH09-01 1 ORU
1 MSH09-02 1 R01
1 PID00 1 PID
1 PID01 1 
1 PID02 1 
1 PID03 1 555-44-4444
1 PID04 1 
1 PID05-01 1 EVERYWOMAN
1 PID05-02 1 EVE
1 PID05-03 1 E
1 PID05-04 1 
1 PID05-05 1 
1 PID05-06 1 
1 PID05-07 1 L
1 PID06 1 JONES
1 PID07 1 19620320
1 PID08 1 F
1 PID09 1 
1 PID10 1 
1 PID11-01 1 153 FERNWOOD DR.
1 PID11-02 1 
1 PID11-03 1 STATESVILLE
1 PID11-04 1 OH
1 PID11-05 1 35292
1 PID12 1 
1 OBR00 1 OBR
1 OBR01 1 1
1 OBR02-01 1 845439
1 OBR02-02 1 GHH OE
1 OBR03-01 1 1045813
1 OBR03-02 1 GHH LAB
1 OBR04-01 1 15545
1 OBR04-02 1 GLUCOSE
1 OBR05 1 
1 OBR06 1 
1 OBR07 1 200202150730
1 OBX00 1 OBX
1 OBX01 1 1
1 OBX02 1 SN
1 OBX03-01 1 1554-5
1 OBX03-02 1 GLUCOSE
1 OBX03-03 1 POST 12H CFST:MCNC:PT:SER/PLAS:QN
1 OBX04 1 
1 OBX05-01 1 
1 OBX05-02 1 182
1 OBX06 1 mg/dl
1 OBX07 1 70_105
1 OBX00 2 OBX
1 OBX01 2 1
1 OBX02 2 SN
1 OBX03-01 2 1554-5
1 OBX03-02 2 GLUCOSE
1 OBX03-03 2 POST 12H CFST:MCNC:PT:SER/PLAS:QN
1 OBX04 2 
1 OBX05-01 2 
1 OBX05-02 2 182
1 OBX06 2 mg/dl
1 OBX07 2 70_105

2 MSH00 1 MSH
2 MSH01 1 |
2 MSH02 1 ^~&
2 MSH03 1 GHH LAB
2 MSH04 1 ELAB-3
2 MSH05 1 GHH OE
2 MSH06 1 BLDG4
2 MSH07 1 200202150930
2 MSH08 1 
2 MSH09-01 1 ORU
2 MSH09-02 1 R01
2 PID00 1 PID
2 PID01 1 
2 PID02 1 
2 PID03 1 555-44-4444
2 PID04 1 
2 PID05-01 1 EVERYWOMAN
2 PID05-02 1 EVE
2 PID05-03 1 E
2 PID05-04 1 
2 PID05-05 1 
2 PID05-06 1 
2 PID05-07 1 L
2 PID06 1 JONES
2 PID07 1 19620320
2 PID08 1 F
2 PID09 1 
2 PID10 1 
2 PID11-01 1 153 FERNWOOD DR.
2 PID11-02 1 
2 PID11-03 1 STATESVILLE
2 PID11-04 1 OH
2 PID11-05 1 35292
2 PID12 1 
2 OBR00 1 OBR
2 OBR01 1 1
2 OBR02-01 1 845439
2 OBR02-02 1 GHH OE
2 OBR03-01 1 1045813
2 OBR03-02 1 GHH LAB
2 OBR04-01 1 15545
2 OBR04-02 1 GLUCOSE
2 OBR05 1 
2 OBR06 1 
2 OBR07 1 200202150730
2 OBX00 1 OBX
2 OBX01 1 1
2 OBX02 1 SN
2 OBX03-01 1 1554-5
2 OBX03-02 1 GLUCOSE
2 OBX03-03 1 POST 12H CFST:MCNC:PT:SER/PLAS:QN
2 OBX04 1 
2 OBX05-01 1 
2 OBX05-02 1 182
2 OBX06 1 mg/dl
2 OBX07 1 70_105
2 OBX00 2 OBX
2 OBX01 2 1
2 OBX02 2 SN
2 OBX03-01 2 1554-5
2 OBX03-02 2 GLUCOSE
2 OBX03-03 2 POST 12H CFST:MCNC:PT:SER/PLAS:QN
2 OBX04 2 
2 OBX05-01 2 
2 OBX05-02 2 182
2 OBX06 2 mg/dl
2 OBX07 2 70_105

I believe that in as plsql pipelined function would be the best way.

Any help would be appreciated.

sql oracle split

edited Mar 18 at 17:11

asked Mar 15 at 8:24

dang

63212145

edited Mar 18 at 17:11

asked Mar 15 at 8:24

dang

63212145

edited Mar 18 at 17:11

asked Mar 15 at 8:24

dang

63212145

asked Mar 15 at 8:24

dang

63212145

asked Mar 15 at 8:24

dang

63212145

2

Didn't you ask this question yesterday?

– jarlh
Mar 15 at 8:28

That was a different question. I have to do a lot of string manipulation in Oracle and I have no idea how to do it. So, I am trying to learn using different techniques.

– dang
Mar 15 at 8:44

what do you mean you are not sure if the statement would work. did you try out it or not? if you try to use search, you will already find a lot of questions and answers how to split a string by delimeter.

– hotfix
Mar 15 at 9:22

5

This is an HL7 format file commonly used to transfer healthcare data. It has a defined layout that you need to be aware of when consuming the data within. It can have sub sections that repeat, etc. i.e. the second pipe-delimited element after the MSH which looks like a jumble of characters (^~&) actually defines the delimiters usind for the sub-elements, etc. Make sure you are carefully following the HL7 spec when extracting data!

– Gary_W
Mar 18 at 18:41

1

@dang Do a little searching for HL7 parser, this has been around for ages and is typically done in an ETL tool. Check this out: hl7messageparser.azurewebsites.net/Parse/ParseView

– Gary_W
Mar 20 at 19:06

|
show 3 more comments

2

Didn't you ask this question yesterday?

– jarlh
Mar 15 at 8:28

That was a different question. I have to do a lot of string manipulation in Oracle and I have no idea how to do it. So, I am trying to learn using different techniques.

– dang
Mar 15 at 8:44

what do you mean you are not sure if the statement would work. did you try out it or not? if you try to use search, you will already find a lot of questions and answers how to split a string by delimeter.

– hotfix
Mar 15 at 9:22

5

This is an HL7 format file commonly used to transfer healthcare data. It has a defined layout that you need to be aware of when consuming the data within. It can have sub sections that repeat, etc. i.e. the second pipe-delimited element after the MSH which looks like a jumble of characters (^~&) actually defines the delimiters usind for the sub-elements, etc. Make sure you are carefully following the HL7 spec when extracting data!

– Gary_W
Mar 18 at 18:41

1

@dang Do a little searching for HL7 parser, this has been around for ages and is typically done in an ETL tool. Check this out: hl7messageparser.azurewebsites.net/Parse/ParseView

– Gary_W
Mar 20 at 19:06

Didn't you ask this question yesterday?

– jarlh
Mar 15 at 8:28

That was a different question. I have to do a lot of string manipulation in Oracle and I have no idea how to do it. So, I am trying to learn using different techniques.

– dang
Mar 15 at 8:44

what do you mean you are not sure if the statement would work. did you try out it or not? if you try to use search, you will already find a lot of questions and answers how to split a string by delimeter.

– hotfix
Mar 15 at 9:22

This is an HL7 format file commonly used to transfer healthcare data. It has a defined layout that you need to be aware of when consuming the data within. It can have sub sections that repeat, etc. i.e. the second pipe-delimited element after the MSH which looks like a jumble of characters (^~&) actually defines the delimiters usind for the sub-elements, etc. Make sure you are carefully following the HL7 spec when extracting data!

– Gary_W
Mar 18 at 18:41

@dang Do a little searching for HL7 parser, this has been around for ages and is typically done in an ETL tool. Check this out: hl7messageparser.azurewebsites.net/Parse/ParseView

– Gary_W
Mar 20 at 19:06

|
show 3 more comments

1 Answer
1

active

oldest

votes

-1

It is PL/SQL and assuming your string can be of arbitrary length as well (i.e. more than 32K); you should use a table function along with dbms_lob package to parse it and then return multiple rows.

Blob Journey from Web to DB is a general article that shows how to manipulate blobs from web point of view. But approach there is the same. See the section around [Selecting Data]. This is simply splitting at 4000 bytes but your split logic will have to take into account the |. Idea is same though.

Then later on see the [table] usage along with PL/SQL

answered Mar 22 at 21:52

Saad Ahmad

1906

Can you show me the code for this one?

– dang
Mar 23 at 18:15

You can see the link. It has all the code

– Saad Ahmad
Mar 23 at 19:20

add a comment |

Your Answer

StackExchange.ifUsing("editor", function ()
StackExchange.using("externalEditor", function ()
StackExchange.using("snippets", function ()
StackExchange.snippets.init();
);
);
, "code-snippets");

StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "1"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);

else
createEditor();

);

function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);

);

draft saved

draft discarded

StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f55178365%2fsplit-text-into-multiple-lines-based-on-pipe-and-cap-delimiter-oracle-pl-sql-p%23new-answer', 'question_page');

);

Post as a guest

Name

Required, but never shown

1 Answer
1

active

oldest

votes

1 Answer
1

active

oldest

votes

-1

It is PL/SQL and assuming your string can be of arbitrary length as well (i.e. more than 32K); you should use a table function along with dbms_lob package to parse it and then return multiple rows.

Then later on see the [table] usage along with PL/SQL

answered Mar 22 at 21:52

Saad Ahmad

1906

Can you show me the code for this one?

– dang
Mar 23 at 18:15

You can see the link. It has all the code

– Saad Ahmad
Mar 23 at 19:20

add a comment |

-1

It is PL/SQL and assuming your string can be of arbitrary length as well (i.e. more than 32K); you should use a table function along with dbms_lob package to parse it and then return multiple rows.

Then later on see the [table] usage along with PL/SQL

answered Mar 22 at 21:52

Saad Ahmad

1906

Can you show me the code for this one?

– dang
Mar 23 at 18:15

You can see the link. It has all the code

– Saad Ahmad
Mar 23 at 19:20

add a comment |

-1

It is PL/SQL and assuming your string can be of arbitrary length as well (i.e. more than 32K); you should use a table function along with dbms_lob package to parse it and then return multiple rows.

Then later on see the [table] usage along with PL/SQL

answered Mar 22 at 21:52

Saad Ahmad

1906

It is PL/SQL and assuming your string can be of arbitrary length as well (i.e. more than 32K); you should use a table function along with dbms_lob package to parse it and then return multiple rows.

Then later on see the [table] usage along with PL/SQL

answered Mar 22 at 21:52

Saad Ahmad

1906

answered Mar 22 at 21:52

Saad Ahmad

1906

answered Mar 22 at 21:52

Saad Ahmad

1906

answered Mar 22 at 21:52

Saad Ahmad

1906

Can you show me the code for this one?

– dang
Mar 23 at 18:15

You can see the link. It has all the code

– Saad Ahmad
Mar 23 at 19:20

add a comment |

Can you show me the code for this one?

– dang
Mar 23 at 18:15

You can see the link. It has all the code

– Saad Ahmad
Mar 23 at 19:20

Can you show me the code for this one?

– dang
Mar 23 at 18:15

You can see the link. It has all the code

– Saad Ahmad
Mar 23 at 19:20

add a comment |

draft saved

draft discarded

Thanks for contributing an answer to Stack Overflow!

Please be sure to answer the question. Provide details and share your research!

But avoid …

Asking for help, clarification, or responding to other answers.

Making statements based on opinion; back them up with references or personal experience.

To learn more, see our tips on writing great answers.

draft saved

draft discarded

Post as a guest

Name

Required, but never shown

Name

Required, but never shown

Name

Required, but never shown

This page is only for reference, If you need detailed information, please check here

搜尋此網誌

Styjun

1 Answer
1

Your Answer

Post as a guest

1 Answer
1

1 Answer
1

Post as a guest

Popular posts from this blog

밀양 대씨 역사 각주 함께 보기 둘러보기 메뉴밀양 대씨

1973년 목차 사건 문화 탄생 사망 노벨상 달력 둘러보기 메뉴

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

1 Answer 1

1 Answer 1

Sign up or log in

Post as a guest

Post as a guest

Sign up or log in

Post as a guest

Sign up or log in

Post as a guest

Sign up or log in

Post as a guest

Popular posts from this blog

밀양 대씨 역사 각주 함께 보기 둘러보기 메뉴밀양 대씨

1973년 목차 사건 문화 탄생 사망 노벨상 달력 둘러보기 메뉴

1 Answer
1

1 Answer
1

1 Answer
1