i want write parser (as qi extension) can used via my_parser(p1, p2, ...)
p1, p2, ...
qi parser expressions.
actually, want implement best_match
parser works qi alternative, selects not first matching rule rule 'explains' of input.
given 2 rules simple_id = +(qi::alpha)
, complex_id = simple_id >> *(qi::string("::") > simple_id)
select complex_id
on input willy::anton
. , not produce intermediate attributes while doing so. these benefits payed runtime because lookahead parsing required.
in view there several use cases such parser construct. benchmark(p1, ...)
supporting parser optimization, 1 example. provide that, once know how that.
this parser non-terminal. tried (hard), can not find answers problem within qi. guess is, c++ mechanisms tightly integrated withc qi, there no understandable entry point, @ least me.
it quite easy implement want introducing operator. append current solution below. compiles expected incomplete.
a qi operator gets fusion::vector/sequence (whatever) ready made , acts on it. there seem no library offerings solve problem. make_nary_composite
expects args being compiled elements
.
i tried lot, failed, don't want bore that.
a workaround can imagine provide stateful operator ,
, make p1, ...
legal qi expression. implement unary_parser best_match
or directive process expression. ,
2 modes: getting length of input current (successful) parser consumes , parsing selected 1 phase 1. wrapper run comma parser first in mode 1 , in mode 2. might ugly, work.
the operator implementation p1 |= p2 |= ...
expensive regarding runtime n > 2. happy around that.
minimum (but nevertheless reasonable) overhead have best_match(p1, ...)
. possible @ all?
for not experienced boost::fusion added questions code (how tos regarding fusion).
it feels wrong, press nary non-terminal parser unary parser scheme. poor understanding seems way done. i'd highly appreciate me out of this.
my environment: boost 1.61, msvc 2015 upd 2, target win32console.
progress:
i think, i'm getting surely. happy see error_invalid_expression
. compile failed because of following proto expression:
boost::proto::exprns_::expr< boost::proto::tagns_::tag::function,boost::proto::argsns_::list5< const boost::proto::exprns_::expr< boost::proto::tagns_::tag::terminal,boost::proto::argsns_::term< mxc::qitoo::tag::best_match > ,0> & ,const boost::spirit::qi::rule< iterator_type,std::string (void), boost::spirit::standard::space_type, boost::spirit::unused_type,boost::spirit::unused_type> & ,const boost::spirit::qi::rule< iterator_type,std::string (void), boost::spirit::standard::space_type, boost::spirit::unused_type, boost::spirit::unused_type> & , const boost::spirit::qi::rule< iterator_type,std::string (void), boost::spirit::standard::space_type, boost::spirit::unused_type, boost::spirit::unused_type> & ,const boost::spirit::qi::rule< iterator_type,std::string (void), boost::spirit::standard::space_type, boost::spirit::unused_type, boost::spirit::unused_type> & > ,5>
this expression describes function-style usage of best_match parser. test rule looks like
qitoo::best_match(id, qualified_id, id, qualified_id)
where id
, qualified_id
rules mentioned above. want. error occurs because expression not have member parse
. okay. digging deeper found qi's meta-compiler support unary, binary , nary (what means variadic) parsers. not support function style syntax. qi seems utilize unary, binary , nary operators only. , allthough nary type supported, more or less obsolete, because qi operators binary max.
end progress edit
#include <boost/spirit/home/qi.hpp> namespace boost { namespace spirit { /////////////////////////////////////////////////////////////////////////// // enablers /////////////////////////////////////////////////////////////////////////// template <> struct use_operator<qi::domain, proto::tag::bitwise_or_assign> // enables |= : mpl::true_ {}; template <> struct flatten_tree<qi::domain, proto::tag::bitwise_or_assign> // flattens |= : mpl::true_ {}; } } namespace mxc { namespace qitoo { namespace spirit = boost::spirit; namespace qi = spirit::qi; namespace fusion = boost::fusion; template <typename elements> struct best_match_parser : qi::nary_parser<mxc::qitoo::best_match_parser<elements>> { // 1 lookahead parse of each qi expression find out rule matches // of input template <typename iterator, typename context, typename skipper> struct best_function { best_function(iterator& first, iterator const& last, context& context, skipper const& skipper, int& best, int& idx, int& size) : first(first), last(last), context(context), skipper(skipper), best(best), idx(idx), size(size) {}; template <typename component> void operator()(component& component) const { iterator f = first; if (component.parse(f, last, context, skipper, qi::unused)) { // please have on how transport best, idx , size // parser context // // guess, nasty hack , done better // advice highliy appreciated int l = f - first; if (l > size) { size = l; best = idx++; } idx++; } } private: int& best; int& idx; int& size; iterator& first; iterator const& last; context& context; skipper const& skipper; }; template <typename context, typename iterator> struct attribute { // put element attributes in tuple typedef typename spirit::traits::build_attribute_sequence < elements, context, spirit::traits::alternative_attribute_transform , iterator, qi::domain >::type all_attributes; // ok, make variant on attribute sequence. note // build_variant makes sure 1) attributes in variant // unique 2) puts unused attribute, if there any, // front , 3) collapses single element variants, variant<t> // t. typedef typename spirit::traits::build_variant<all_attributes>::type type; }; best_match_parser(elements const& elements_) : elements(elements_), size(0), idx(0), best(-1) {} template <typename iterator, typename context, typename skipper, typename attribute> bool parse(iterator& first, iterator const& last , context& context, skipper const& skipper , attribute& attr_) const { best_function<iterator, context, skipper> f(first, last, context, skipper, best, idx, size); // find out parser matches of input fusion::for_each(elements, f); // best >= 0 if @ least 1 parser successful if (best >= 0) { // have best parser identified, how access it? // understand next line can not work, i'm looking // --> auto v = fusion::at<boost::mpl::int_<best>>(elements); }; return false; } template <typename context> qi::info what(context& context) const { qi::info result("best_match"); fusion::for_each(elements, spirit::detail::what_function<context>(result, context)); return result; } elements elements; mutable int best; mutable int idx; mutable int size; }; } } namespace boost { namespace spirit { namespace qi { /////////////////////////////////////////////////////////////////////////// // parser generators: make_xxx function (objects) /////////////////////////////////////////////////////////////////////////// template <typename elements, typename modifiers> struct make_composite<proto::tag::bitwise_or_assign, elements, modifiers> : make_nary_composite < elements, mxc::qitoo::best_match_parser > {}; } namespace traits { /////////////////////////////////////////////////////////////////////////// template <typename elements> struct has_semantic_action<mxc::qitoo::best_match_parser<elements> > : nary_has_semantic_action<elements> {}; /////////////////////////////////////////////////////////////////////////// template <typename elements, typename attribute, typename context , typename iterator> struct handles_container<mxc::qitoo::best_match_parser<elements>, attribute, context , iterator> : nary_handles_container<elements, attribute, context, iterator> {}; } } } namespace qi = boost::spirit::qi; namespace qitoo = mxc::qitoo; using iterator_type = std::string::const_iterator; using result_type = std::string; template<typename parser> void parse(const std::string message, const std::string& input, const std::string& rule, const parser& parser) { iterator_type iter = input.begin(), end = input.end(); std::vector<result_type> parsed_result; std::cout << "-------------------------\n"; std::cout << message << "\n"; std::cout << "rule: " << rule << std::endl; std::cout << "parsing: \"" << input << "\"\n"; bool result = qi::phrase_parse(iter, end, parser, qi::space, parsed_result); if (result) { std::cout << "parser succeeded.\n"; std::cout << "parsed " << parsed_result.size() << " elements:"; (const auto& str : parsed_result) std::cout << "[" << str << "]"; std::cout << std::endl; } else { std::cout << "parser failed" << std::endl; } if (iter == end) { std::cout << "eoi reached." << std::endl; } else { std::cout << "eoi not reached. unparsed: \"" << std::string(iter, end) << "\"" << std::endl; } std::cout << "-------------------------\n"; } int main() { namespace qi = boost::spirit::qi; qi::rule < iterator_type, std::string(), qi::space_type> id = (qi::alpha | qi::char_('_')) >> *(qi::alnum | qi::char_('_')); qi::rule < iterator_type, std::string(), qi::space_type> qualified_id = id >> *(qi::string("::") > id); namespace qitoo = mxc::qitoo; namespace qi = boost::spirit::qi; parse("best match operator, select second rule" , "willy::anton::lutz" , "id |= qualified_id" , id |= qualified_id); }
your example
i don't see how sample requires of this. reorder branches, realize short branch special case of qualified case n=1: live on coliru¹ (or using x3 version if prefer).
the generic case
now, having mentioned x3, has capacity make live easier!
here's think wanted, in general case:
namespace parser { template <typename... parsers> struct longest_parser : x3::parser_base { longest_parser(parsers... sub) : _alternatives {sub...} { } template <typename it, typename ctx, typename other, typename attr> bool parse(it& f, l, ctx& ctx, other const& other, attr& attr) const { auto const saved = f; //// exclude pre-skip length comparisons, pre-skip here: // x3::skip_over(f, l, ctx); auto seq = std::make_index_sequence<sizeof...(parsers)>(); auto best = select_best(f, l, ctx, seq); //std::cout << "longest match @ index #" << best << "\n"; bool ok = dispatch(f, l, ctx, other, attr, best, seq); if (!ok) f = saved; return ok; } private: template <typename it, typename ctx, typename p> size_t length_of(it f, l, ctx ctx, p const& p) const { boost::iterator_range<it> matched; return x3::raw[p].parse(f, l, ctx, x3::unused, matched)? boost::size(matched) : 0; } template <typename it, typename ctx, size_t... i> size_t select_best(it f, l, ctx& ctx, std::index_sequence<i...>) const { std::array<size_t, sizeof...(i)> lengths { length_of(f, l, ctx, std::get<i>(_alternatives))... }; return std::distance(lengths.begin(), std::max_element(lengths.begin(), lengths.end())); } template <typename it, typename ctx, typename other, typename attr, size_t... i> bool dispatch(it& f, l, ctx& ctx, other const& other, attr& attr, size_t targetidx, std::index_sequence<i...>) const { //return (real_parse<i>(f, l, ctx, other, attr, targetidx) || ...); std::array<bool, sizeof...(i)> b = { real_parse<i>(f, l, ctx, other, attr, targetidx)... }; return std::accumulate(b.begin(), b.end(), false, std::logical_or<bool>()); } template <size_t idx, typename it, typename ctx, typename other, typename attr> bool real_parse(it& f, l, ctx& ctx, other const& other, attr& attr, size_t targetidx) const { if (targetidx != idx) return false; return std::get<idx>(_alternatives).parse(f, l, ctx, other, attr); } std::tuple<parsers...> _alternatives; }; template <typename... ps> longest_parser<ps...> longest(ps... p) { return {x3::as_parser(p)...}; } }
note fold expression use in dispatch
if compiler supports (coliru does, edit see!).
note subtle choice regarding skippable (probably whitespace); if it's not significant length comparisons, uncomment pre-skip.
live demo
#include <boost/spirit/home/x3.hpp> #include <type_traits> #include <iostream> #include <numeric> namespace x3 = boost::spirit::x3; namespace std { template <typename t> // easy debug printing; hack static std::ostream& operator<<(std::ostream& os, std::vector<t> const& v) { (auto& el : v) std::cout << '[' << el << ']'; return os; } } using string_vec = std::vector<std::string>; using result_type = boost::variant<std::string, double, string_vec>; template <typename parser> void parse(const std::string message, const std::string &input, const std::string &rule, const parser &parser) { auto iter = input.begin(), end = input.end(); std::cout << "-------------------------\n"; std::cout << message << "\n"; std::cout << "rule: " << rule << "\n"; std::cout << "parsing: '" << input << "'\n"; result_type parsed_result; bool result = phrase_parse(iter, end, parser, x3::space, parsed_result); if (result) { std::cout << "parsed " << parsed_result << "\n"; } else { std::cout << "parser failed\n"; } if (iter != end) std::cout << "eoi not reached. unparsed: '" << std::string(iter, end) << "'\n"; } namespace parser { template <typename... parsers> struct longest_parser : x3::parser_base { longest_parser(parsers... sub) : _alternatives {sub...} { } template <typename it, typename ctx, typename other, typename attr> bool parse(it& f, l, ctx& ctx, other const& other, attr& attr) const { auto const saved = f; //// exclude pre-skip length comparisons, pre-skip here: // x3::skip_over(f, l, ctx); auto seq = std::make_index_sequence<sizeof...(parsers)>(); auto best = select_best(f, l, ctx, seq); //std::cout << "longest match @ index #" << best << "\n"; bool ok = dispatch(f, l, ctx, other, attr, best, seq); if (!ok) f = saved; return ok; } private: template <typename it, typename ctx, typename p> size_t length_of(it f, l, ctx ctx, p const& p) const { boost::iterator_range<it> matched; return x3::raw[p].parse(f, l, ctx, x3::unused, matched)? boost::size(matched) : 0; } template <typename it, typename ctx, size_t... i> size_t select_best(it f, l, ctx& ctx, std::index_sequence<i...>) const { std::array<size_t, sizeof...(i)> lengths { length_of(f, l, ctx, std::get<i>(_alternatives))... }; return std::distance(lengths.begin(), std::max_element(lengths.begin(), lengths.end())); } template <typename it, typename ctx, typename other, typename attr, size_t... i> bool dispatch(it& f, l, ctx& ctx, other const& other, attr& attr, size_t targetidx, std::index_sequence<i...>) const { //return (real_parse<i>(f, l, ctx, other, attr, targetidx) || ...); std::array<bool, sizeof...(i)> b = { real_parse<i>(f, l, ctx, other, attr, targetidx)... }; return std::accumulate(b.begin(), b.end(), false, std::logical_or<bool>()); } template <size_t idx, typename it, typename ctx, typename other, typename attr> bool real_parse(it& f, l, ctx& ctx, other const& other, attr& attr, size_t targetidx) const { if (targetidx != idx) return false; return std::get<idx>(_alternatives).parse(f, l, ctx, other, attr); } std::tuple<parsers...> _alternatives; }; template <typename... ps> longest_parser<ps...> longest(ps... p) { return {x3::as_parser(p)...}; } } int main() { auto id = x3::rule<void, std::string> {} = x3::lexeme [ x3::char_("a-za-z_") >> *x3::char_("a-za-z0-9_") ]; auto qualified = x3::rule<void, string_vec> {} = id % "::"; #define test_case(label, input, rule) parse(label, input, #rule, rule) test_case("unqualified" , "willy" , parser::longest(id, x3::int_, x3::double_)); test_case("unqualified whitespace", " willy \t" , parser::longest(id, x3::int_, x3::double_)); test_case("integral or number" , "123.78::anton::lutz" , parser::longest(id, x3::int_, x3::double_)); test_case("qualified" , "willy anton::lutz" , parser::longest(id, x3::int_, x3::double_)); test_case("qualified whitespace" , "willy \tanton::lutz" , parser::longest(id, x3::int_, x3::double_)); test_case("unqualified" , "willy" , parser::longest(id, x3::int_, x3::double_, qualified)); test_case("unqualified whitespace", " willy \t" , parser::longest(id, x3::int_, x3::double_, qualified)); test_case("integral or number" , "123.78::anton::lutz" , parser::longest(id, x3::int_, x3::double_, qualified)); test_case("qualified" , "willy::anton::lutz" , parser::longest(id, x3::int_, x3::double_, qualified)); test_case("qualified whitespace" , "willy ::\tanton::lutz", parser::longest(id, x3::int_, x3::double_, qualified)); test_case("unqualified" , "willy" , parser::longest(x3::int_, x3::double_, qualified)); test_case("unqualified whitespace", " willy \t" , parser::longest(x3::int_, x3::double_, qualified)); test_case("integral or number" , "123.78::anton::lutz" , parser::longest(x3::int_, x3::double_, qualified)); test_case("qualified" , "willy::anton::lutz" , parser::longest(x3::int_, x3::double_, qualified)); test_case("qualified whitespace" , "willy ::\tanton::lutz", parser::longest(x3::int_, x3::double_, qualified)); }
prints
------------------------- unqualified rule: parser::longest(id, x3::int_, x3::double_) parsing: 'willy' parsed willy ------------------------- unqualified whitespace rule: parser::longest(id, x3::int_, x3::double_) parsing: ' willy ' parsed willy ------------------------- integral or number rule: parser::longest(id, x3::int_, x3::double_) parsing: '123.78::anton::lutz' parsed 123.78 eoi not reached. unparsed: '::anton::lutz' ------------------------- qualified rule: parser::longest(id, x3::int_, x3::double_) parsing: 'willy anton::lutz' parsed willy eoi not reached. unparsed: 'anton::lutz' ------------------------- qualified whitespace rule: parser::longest(id, x3::int_, x3::double_) parsing: 'willy anton::lutz' parsed willy eoi not reached. unparsed: 'anton::lutz' ------------------------- unqualified rule: parser::longest(id, x3::int_, x3::double_, qualified) parsing: 'willy' parsed willy ------------------------- unqualified whitespace rule: parser::longest(id, x3::int_, x3::double_, qualified) parsing: ' willy ' parsed willy ------------------------- integral or number rule: parser::longest(id, x3::int_, x3::double_, qualified) parsing: '123.78::anton::lutz' parsed 123.78 eoi not reached. unparsed: '::anton::lutz' ------------------------- qualified rule: parser::longest(id, x3::int_, x3::double_, qualified) parsing: 'willy::anton::lutz' parsed [willy][anton][lutz] ------------------------- qualified whitespace rule: parser::longest(id, x3::int_, x3::double_, qualified) parsing: 'willy :: anton::lutz' parsed [willy][anton][lutz] ------------------------- unqualified rule: parser::longest(x3::int_, x3::double_, qualified) parsing: 'willy' parsed [willy] ------------------------- unqualified whitespace rule: parser::longest(x3::int_, x3::double_, qualified) parsing: ' willy ' parsed [willy] ------------------------- integral or number rule: parser::longest(x3::int_, x3::double_, qualified) parsing: '123.78::anton::lutz' parsed 123.78 eoi not reached. unparsed: '::anton::lutz' ------------------------- qualified rule: parser::longest(x3::int_, x3::double_, qualified) parsing: 'willy::anton::lutz' parsed [willy][anton][lutz] ------------------------- qualified whitespace rule: parser::longest(x3::int_, x3::double_, qualified) parsing: 'willy :: anton::lutz' parsed [willy][anton][lutz]
note different results depending on parser expressions in alternatives.