I am trying to parse a site that uses markup like this:
<b>header</b>data<strong>header</strong>data
so I have the selector
.select("b, strong")
and then try to extract the text in between. That works fine.
The problem: the site also has markup like this, e.g.
<strong><strong>headerx</strong><br /></strong>data
This now messes up my loops, since the text headerx is matched twice. How can I ignore the nested strong?
Update #1: solved, but maybe there is a better way.
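    Elements selected = info.select("b, strong");
    Element next = selected.isEmpty() ? null : selected.get(0);
    Element now = null;
    for (int i = 0; next != null; i++) {
        now = next;
        next = null;
        // now itself plus everything nested inside it
        Elements children = now.getAllElements();
        // advance to the first selected element that is not inside now
        for (; selected.size() > i; i++) {
            Element candidate = selected.get(i);
            if (!children.contains(candidate)) {
                next = candidate;
                break;
            }
        }
        // do whatever with now & next (next is null after the last header)
    }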
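The same idea can also be written as an ancestor test: keep a header only if none of its parents is a b or strong. Below is a rough sketch of that check. To keep it dependency-free it uses the JDK's built-in DOM parser instead of jsoup, so it assumes the fragment is well-formed XML (real pages often are not) and wraps it in a made-up root element. The names NestedHeaders, isHeader, hasHeaderAncestor and extract are hypothetical, not part of any library.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class NestedHeaders {

    // True for the two header tags used on the site.
    static boolean isHeader(Node n) {
        if (n.getNodeType() != Node.ELEMENT_NODE) return false;
        String name = n.getNodeName();
        return name.equals("b") || name.equals("strong");
    }

    // True if any ancestor of n is itself a header, i.e. n is a nested duplicate.
    static boolean hasHeaderAncestor(Node n) {
        for (Node p = n.getParentNode(); p != null; p = p.getParentNode()) {
            if (isHeader(p)) return true;
        }
        return false;
    }

    // Returns {header, data} pairs; data is the text node directly after the outermost header.
    static List<String[]> extract(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            List<String[]> pairs = new ArrayList<>();
            NodeList all = doc.getElementsByTagName("*"); // all elements, document order
            for (int i = 0; i < all.getLength(); i++) {
                Node el = all.item(i);
                if (!isHeader(el) || hasHeaderAncestor(el)) continue;
                Node after = el.getNextSibling();
                String data = (after != null && after.getNodeType() == Node.TEXT_NODE)
                        ? after.getNodeValue().trim() : "";
                pairs.add(new String[] { el.getTextContent().trim(), data });
            }
            return pairs;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse input", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical sample: the two cases from the question inside a wrapper element.
        String xml = "<root><b>header</b>data1"
                + "<strong><strong>headerx</strong><br /></strong>data2</root>";
        for (String[] p : extract(xml)) {
            System.out.println(p[0] + " -> " + p[1]);
        }
    }
}
```

With the sample input this prints "header -> data1" and "headerx -> data2", so the inner strong is skipped. In jsoup itself the equivalent check would presumably be something like el.parents().is("b, strong") inside a loop over the selected elements, but I have not tested that against the real page.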
Try this:
Edit:
info.select("b,strong").remove().text();