Removing backslash and newline character within one column

I'm receiving a text string that contains HTML along with backslash and newline characters inside one column. I can easily remove the HTML tags using .replaceAll("<[^>]*>","") but the backslash and newline characters still remain. So I then tried replaceAll("\r\n|\r|\n",""), but that also removes the end-of-line characters that separate the records.

Input String:

test1|test2|test3|test4|test5
testa|testB|testc|testd|teste
test11|test22|
<table cellpadding="0" cellspacing="0" id="master_tbl">
<tbody>
<tr id="master_cr">
<td>
<table cellpadding="0" cellspacing="0" id="master_DefaultContent_rts_s3801_tbl">
<tbody>
<tr id="master_DefaultContent_rts_s3801_cr">
<td>
<table cellpadding="0" cellspacing="0" id="master_DefaultContent_rts_s3801_ctl03" width="100%">
<tbody>
<tr>
<td><span id="master_DefaultContent_rts_s3801_f25914c">test33</span></td>
</tr>
</tbody>
</table>
</td>
</tr>
</tbody>
</table>
</td>
</tr>
</tbody>
</table>|test44|test55
test66|test77|test88|test99|test00

Expected output string:

test1|test2|test3|test4|test5
testa|testB|testc|testd|teste
test11|test22|test33|test44|test55
test66|test77|test88|test99|test00

Answer

It's a strange request, and I'd fix it at the place where the input text is generated.

However, it seems that each newline you want to remove is preceded by a backslash:

input.replaceAll("<[^>]*>", "")           // strip HTML tags
     .replaceAll("\\\\[\r\n]+", "")       // backslash + newline -> empty
     .replaceAll("\\|[\r\n]+", "|")       // pipe + newline -> pipe
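
A minimal runnable sketch of the same chain, in case it helps to try it out. The class name CleanColumns, the method clean, and the shortened sample input are made up for illustration, and the sample assumes (as the answer does) that every newline inside the HTML column is preceded by a backslash. Note the escaping: in Java source, "\\\\" is the regex for one literal backslash, and "\\|" is a literal pipe.

```java
class CleanColumns {
    // Hypothetical helper: applies the three replacements in order.
    static String clean(String input) {
        return input
            .replaceAll("<[^>]*>", "")        // strip HTML tags
            .replaceAll("\\\\[\r\n]+", "")    // backslash + line break(s) -> empty
            .replaceAll("\\|[\r\n]+", "|");   // pipe + line break(s) -> pipe
    }

    public static void main(String[] args) {
        // Assumed input shape: each line inside the HTML ends with a backslash.
        String input =
              "test1|test2|test3|test4|test5\n"
            + "test11|test22|\n"
            + "<table id=\"master_tbl\">\\\n"
            + "<tr>\\\n"
            + "<td><span>test33</span></td>\\\n"
            + "</tr>\\\n"
            + "</table>|test44|test55\n"
            + "test66|test77|test88|test99|test00\n";

        System.out.print(clean(input));
        // prints:
        // test1|test2|test3|test4|test5
        // test11|test22|test33|test44|test55
        // test66|test77|test88|test99|test00
    }
}
```

If the newlines inside the HTML are not actually preceded by a backslash, the second replacement will not match and the blank lines left behind by the tag removal will stay in the output.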