Struct wasmparser::Parser
pub struct Parser { /* private fields */ }
An incremental parser of a binary WebAssembly module.
This type is intended to be used to incrementally parse a WebAssembly module as bytes become available for the module. This can also be used to parse modules that are already entirely resident within memory.
The primary function for a parser is the Parser::parse function, which will incrementally consume input. You can also use the Parser::parse_all function to parse a module that is entirely resident in memory.
Implementations
impl Parser
pub fn new(offset: u64) -> Parser
Creates a new module parser.
Reports errors and ranges relative to the offset provided, where offset is some logical offset within the input stream that we’re parsing.
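To illustrate what "relative to offset" means, here is a minimal std-only sketch that does not use wasmparser. The helper read_header and its error strings are hypothetical. It reads the 8-byte module header (the "\0asm" magic followed by a little-endian u32 version) and reports the header's range shifted by the logical offset of the buffer's first byte.

```rust
use std::ops::Range;

/// The 4-byte magic that starts every WebAssembly binary: "\0asm".
const WASM_MAGIC: [u8; 4] = [0x00, 0x61, 0x73, 0x6d];

/// Hypothetical helper (not part of wasmparser): reads the 8-byte module
/// header from `data` and reports its range relative to `offset`, the
/// logical position of `data[0]` within the overall input stream.
fn read_header(data: &[u8], offset: u64) -> Result<(u32, Range<u64>), String> {
    if data.len() < 8 {
        let at = offset + data.len() as u64;
        return Err(format!("unexpected end of input at offset {:#x}", at));
    }
    if !data.starts_with(&WASM_MAGIC) {
        return Err(format!("bad magic number at offset {:#x}", offset));
    }
    // The version field is a little-endian u32 that follows the magic.
    let version = u32::from_le_bytes([data[4], data[5], data[6], data[7]]);
    Ok((version, offset..offset + 8))
}
```

Passing 0x100 as the offset makes the same eight bytes report the range 0x100..0x108 instead of 0..8, which is useful when the buffer is a slice taken from the middle of a larger file or stream.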
pub fn parse<'a>(&mut self, data: &'a [u8], eof: bool) -> Result<Chunk<'a>>
Attempts to parse a chunk of data.
This method will attempt to parse the next incremental portion of a WebAssembly binary. Data available for the module is provided as data, and the data can be incomplete if more data has yet to arrive for the module. The eof flag indicates whether data represents all possible data for the module and no more data will ever be received.
There are two ways parsing can succeed with this method:
- Chunk::NeedMoreData - this indicates that there are not enough bytes in data to parse a chunk of this module. The caller needs to wait for more data to be available in this situation before calling this method again. It is guaranteed that this is only returned if eof is false.
- Chunk::Parsed - this indicates that a chunk of the input was successfully parsed. The parsed payload is available in this variant, which also indicates how many bytes of data were consumed. It’s expected that the caller will not provide these bytes back to the Parser again.
Note that all Chunk return values are connected, with a lifetime, to the input buffer. Each successfully parsed chunk borrows the input buffer and is a view into it.
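The two success cases and the borrow of the input buffer can be modeled with a small std-only sketch. ToyChunk and toy_parse are hypothetical names over a made-up record format (one length byte followed by that many payload bytes). This is not wasmparser's implementation, and unlike the real parser it has no notion of Payload::End.

```rust
/// Toy stand-in for `Chunk`; hypothetical, not wasmparser's definition.
#[derive(Debug, PartialEq)]
enum ToyChunk<'a> {
    /// Not enough input yet; the value hints how many more bytes are needed.
    NeedMoreData(u64),
    /// One record was parsed; `payload` borrows from the caller's buffer.
    Parsed { consumed: usize, payload: &'a [u8] },
}

/// Parses one record of a made-up format: a one-byte length followed by
/// that many payload bytes.
fn toy_parse<'a>(data: &'a [u8], eof: bool) -> Result<ToyChunk<'a>, String> {
    // Total bytes the next record needs (1 if even the length byte is missing).
    let needed = match data.first() {
        Some(&len) => 1 + len as usize,
        None => 1,
    };
    if data.len() < needed {
        // With `eof` set no more bytes can arrive, so this is an error
        // rather than a request for more data.
        if eof {
            return Err("unexpected end of input".to_string());
        }
        return Ok(ToyChunk::NeedMoreData((needed - data.len()) as u64));
    }
    Ok(ToyChunk::Parsed { consumed: needed, payload: &data[1..needed] })
}
```

Because the payload is a slice of data, the caller has to finish with it before draining or reallocating the buffer, which is why consumed bytes are typically dropped only after the payload has been handled.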
It is expected that you’ll call this method until Payload::End is reached, at which point you’re guaranteed that the module has been completely parsed. Note that complete parsing, for the top-level wasm module, implies that data is empty and eof is true.
Errors
Parse errors are returned as an Err. Errors can happen when the structure of the module is unexpected or when sections are too large, for example. Note that errors are not returned for malformed contents of sections here. Sections are generally not individually parsed, and each returned Payload needs to be iterated over further to detect all errors.
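As a sketch of why section contents must be iterated to surface every error, consider a hypothetical section whose framing is valid but whose body is a sequence of unsigned LEB128 u32 values. ToySectionReader is an assumed name, not a wasmparser type; the point is only that a malformed body goes unnoticed until the reader is driven.

```rust
/// Hypothetical reader over the raw bytes of an already-framed section
/// whose contents are a sequence of unsigned LEB128 `u32` values.
struct ToySectionReader<'a> {
    bytes: &'a [u8],
}

impl<'a> Iterator for ToySectionReader<'a> {
    type Item = Result<u32, String>;

    fn next(&mut self) -> Option<Self::Item> {
        let bytes = self.bytes;
        if bytes.is_empty() {
            return None;
        }
        let mut value: u32 = 0;
        for (i, &byte) in bytes.iter().enumerate() {
            // A u32 needs at most 5 LEB128 bytes, and the fifth byte may
            // only carry the top 4 bits of the value.
            if i == 5 || (i == 4 && byte & 0x70 != 0) {
                self.bytes = &[];
                return Some(Err("integer too large".to_string()));
            }
            value |= u32::from(byte & 0x7f) << (7 * i);
            if byte & 0x80 == 0 {
                // Continuation bit clear: this value is complete.
                self.bytes = &bytes[i + 1..];
                return Some(Ok(value));
            }
        }
        // Ran out of bytes while the continuation bit was still set.
        self.bytes = &[];
        Some(Err("unexpected end of section".to_string()))
    }
}
```

Constructing the reader never fails; an error only appears as an item once iteration reaches the malformed bytes, so a caller that ignores the contents never sees it.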
Examples
An example of reading a wasm file from a stream (std::io::Read) and incrementally parsing it.
use std::io::Read;
use anyhow::Result;
use wasmparser::{Parser, Chunk, Payload::*};

fn parse(mut reader: impl Read) -> Result<()> {
    let mut buf = Vec::new();
    let mut parser = Parser::new(0);
    let mut eof = false;
    let mut stack = Vec::new();

    loop {
        let (payload, consumed) = match parser.parse(&buf, eof)? {
            Chunk::NeedMoreData(hint) => {
                assert!(!eof); // otherwise an error would be returned

                // Use the hint to preallocate more space, then read
                // some more data into our buffer.
                //
                // Note that the buffer management here is not ideal,
                // but it's compact enough to fit in an example!
                let len = buf.len();
                buf.extend((0..hint).map(|_| 0u8));
                let n = reader.read(&mut buf[len..])?;
                buf.truncate(len + n);
                eof = n == 0;
                continue;
            }

            Chunk::Parsed { consumed, payload } => (payload, consumed),
        };

        match payload {
            // Each of these would be handled individually as necessary
            Version { .. } => { /* ... */ }
            TypeSection(_) => { /* ... */ }
            ImportSection(_) => { /* ... */ }
            AliasSection(_) => { /* ... */ }
            InstanceSection(_) => { /* ... */ }
            ModuleSection(_) => { /* ... */ }
            FunctionSection(_) => { /* ... */ }
            TableSection(_) => { /* ... */ }
            MemorySection(_) => { /* ... */ }
            EventSection(_) => { /* ... */ }
            GlobalSection(_) => { /* ... */ }
            ExportSection(_) => { /* ... */ }
            StartSection { .. } => { /* ... */ }
            ElementSection(_) => { /* ... */ }
            DataCountSection { .. } => { /* ... */ }
            DataSection(_) => { /* ... */ }

            // Here we know how many functions we'll be receiving as
            // `CodeSectionEntry`, so we can prepare for that, and
            // afterwards we can parse and handle each function
            // individually.
            CodeSectionStart { .. } => { /* ... */ }
            CodeSectionEntry(body) => {
                // here we can iterate over `body` to parse the function
                // and its locals
            }

            // When parsing nested modules we need to switch which
            // `Parser` we're using.
            ModuleCodeSectionStart { .. } => { /* ... */ }
            ModuleCodeSectionEntry { parser: subparser, .. } => {
                stack.push(parser);
                parser = subparser;
            }

            CustomSection { name, .. } => { /* ... */ }

            // most likely you'd return an error here
            UnknownSection { id, .. } => { /* ... */ }

            // Once we've reached the end of a module we either resume
            // at the parent module or we break out of the loop because
            // we're done.
            End => {
                if let Some(parent_parser) = stack.pop() {
                    parser = parent_parser;
                } else {
                    break;
                }
            }
        }

        // once we're done processing the payload we can forget the
        // original.
        buf.drain(..consumed);
    }

    Ok(())
}
pub fn parse_all<'a>(
self,
data: &'a [u8]
) -> impl Iterator<Item = Result<Payload<'a>>> + 'a
Convenience function that can be used to parse a module entirely resident in memory.
This function will parse the data provided as a WebAssembly module, assuming that data represents the entire WebAssembly module.
Note that when this function yields ModuleCodeSectionEntry no action needs to be taken with the returned parser. The parser will be automatically switched to internally and more payloads will continue to get returned.
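One way to picture this kind of convenience wrapper is an iterator that repeatedly applies a single-step parser to an in-memory buffer and advances past what was consumed. The std-only sketch below uses a made-up record format (one length byte, then that many payload bytes); toy_parse_one and toy_parse_all are hypothetical names, and stopping after the first error is a choice made for this sketch rather than a statement about wasmparser.

```rust
/// Hypothetical one-record parser for a made-up format. Returns the
/// payload and the number of bytes consumed.
fn toy_parse_one(data: &[u8]) -> Result<(&[u8], usize), String> {
    let len = *data.first().ok_or("unexpected end of input")? as usize;
    let end = 1 + len;
    let payload = data.get(1..end).ok_or("unexpected end of input")?;
    Ok((payload, end))
}

/// Yields every record in `data`, which is assumed to hold the whole input.
fn toy_parse_all<'a>(
    mut data: &'a [u8],
) -> impl Iterator<Item = Result<&'a [u8], String>> + 'a {
    let mut done = false;
    std::iter::from_fn(move || {
        if done || data.is_empty() {
            return None;
        }
        match toy_parse_one(data) {
            Ok((payload, consumed)) => {
                // Advance past the record that was just yielded.
                data = &data[consumed..];
                Some(Ok(payload))
            }
            Err(e) => {
                // Stop after the first error (a choice made for this sketch).
                done = true;
                Some(Err(e))
            }
        }
    })
}
```

The yielded payloads borrow from data for the same lifetime as the iterator's input, mirroring the shape of the signature above.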
pub fn skip_section(&mut self)
Skip parsing the code or module code section entirely.
This function can be used to indicate, after receiving CodeSectionStart or ModuleCodeSectionStart, that the section will not be parsed.
The caller will be responsible for skipping size bytes (found in the CodeSectionStart or ModuleCodeSectionStart payload). Bytes should only be fed into parse after the size bytes have been skipped.
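When input arrives in pieces, the skipped section may span several reads, so the caller has to remember how many of the size bytes are still outstanding. A minimal std-only sketch of that bookkeeping follows; SectionSkipper is a hypothetical helper, not part of wasmparser.

```rust
/// Hypothetical helper (not part of wasmparser) that discards the bytes of
/// a skipped section as they arrive, possibly across several reads.
struct SectionSkipper {
    /// Bytes of the skipped section that have not been discarded yet.
    remaining: u64,
}

impl SectionSkipper {
    /// `size` is the section size reported when the section starts.
    fn new(size: u32) -> Self {
        SectionSkipper { remaining: u64::from(size) }
    }

    /// Drops the part of `buf` that belongs to the skipped section and
    /// returns the bytes that may be handed to the parser again.
    fn feed<'a>(&mut self, buf: &'a [u8]) -> &'a [u8] {
        let n = self.remaining.min(buf.len() as u64) as usize;
        self.remaining -= n as u64;
        &buf[n..]
    }

    /// True once every byte of the skipped section has been discarded.
    fn is_done(&self) -> bool {
        self.remaining == 0
    }
}
```

Each freshly read buffer goes through feed first, and only the returned tail is passed on to parse. Unlike slicing with a fixed index, this does not panic when fewer than size bytes are currently buffered.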
Panics
This function will panic if the parser is not in a state where it’s parsing the code or module code section.
Examples
use wasmparser::{Result, Parser, Chunk, Range, SectionReader, Payload::*};

fn objdump_headers(mut wasm: &[u8]) -> Result<()> {
    let mut parser = Parser::new(0);
    loop {
        let payload = match parser.parse(wasm, true)? {
            Chunk::Parsed { consumed, payload } => {
                wasm = &wasm[consumed..];
                payload
            }
            // this state isn't possible with `eof = true`
            Chunk::NeedMoreData(_) => unreachable!(),
        };
        match payload {
            TypeSection(s) => print_range("type section", &s.range()),
            ImportSection(s) => print_range("import section", &s.range()),
            // .. other sections

            // Print the range of the code section we see, but don't
            // actually iterate over each individual function.
            CodeSectionStart { range, size, .. } => {
                print_range("code section", &range);
                parser.skip_section();
                wasm = &wasm[size as usize..];
            }
            End => break,
            _ => {}
        }
    }
    Ok(())
}

fn print_range(section: &str, range: &Range) {
    println!("{:>40}: {:#010x} - {:#010x}", section, range.start, range.end);
}
Trait Implementations
Auto Trait Implementations
impl RefUnwindSafe for Parser
impl Send for Parser
impl Sync for Parser
impl Unpin for Parser
impl UnwindSafe for Parser
Blanket Implementations
impl<T> BorrowMut<T> for T where
T: ?Sized,
pub fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.
impl<T> ToOwned for T where
T: Clone,
type Owned = T
The resulting type after obtaining ownership.
pub fn to_owned(&self) -> T
Creates owned data from borrowed data, usually by cloning.
pub fn clone_into(&self, target: &mut T)
This is a nightly-only experimental API (toowned_clone_into).
Uses borrowed data to replace owned data, usually by cloning.