Struct wasmparser::Parser

pub struct Parser { /* private fields */ }

An incremental parser of a binary WebAssembly module.

This type is intended to be used to incrementally parse a WebAssembly module as bytes become available for the module. This can also be used to parse modules that are already entirely resident within memory.

The primary function of a parser is the Parser::parse function, which incrementally consumes input. You can also use the Parser::parse_all function to parse a module that is entirely resident in memory.

Implementations

impl Parser

pub fn new(offset: u64) -> Parser

Creates a new module parser.

Reports errors and ranges relative to the offset provided, where offset is some logical offset within the input stream that we’re parsing.
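As a rough illustration of what a logical offset means, the sketch below assumes that a reported position equals the offset passed to Parser::new plus the position within the bytes fed to the parser so far. The helper to_absolute is hypothetical and not part of wasmparser; it only shows the arithmetic under that assumption.

```rust
use std::ops::Range;

/// Hypothetical helper, not a wasmparser API: maps a range measured from the
/// start of the parsed stream to the range a parser created with
/// `Parser::new(base_offset)` would be expected to report, assuming reported
/// positions are `base_offset + position in the stream`.
fn to_absolute(base_offset: u64, local: Range<usize>) -> Range<u64> {
    (base_offset + local.start as u64)..(base_offset + local.end as u64)
}
```

For a module embedded 0x100 bytes into a larger file, a section occupying stream bytes 8..14 would then be reported as 0x108..0x10e, which can be used directly as a position in the enclosing file.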

pub fn parse<'a>(&mut self, data: &'a [u8], eof: bool) -> Result<Chunk<'a>>

Attempts to parse a chunk of data.

This method will attempt to parse the next incremental portion of a WebAssembly binary. Data available for the module is provided as data, and the data can be incomplete if more data has yet to arrive for the module. The eof flag indicates whether data represents all possible data for the module and no more data will ever be received.

There are two ways parsing can succeed with this method:

Chunk::NeedMoreData - this indicates that there are not enough bytes in data to parse a chunk of this module. The caller needs to wait for more data to become available before calling this method again. It is guaranteed that this is only returned if eof is false.
Chunk::Parsed - this indicates that a chunk of the input was successfully parsed. The parsed payload is available in this variant, along with the number of bytes of data that were consumed. It’s expected that the caller will not provide these bytes back to the Parser again.

Note that all Chunk return values are tied, by a lifetime, to the input buffer. Each successfully parsed chunk borrows the input buffer and is a view into it.

It is expected that you’ll call this method until Payload::End is reached, at which point you’re guaranteed that the module has been completely parsed. Note that complete parsing, for the top-level wasm module, implies that data is empty and eof is true.
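The calling protocol described above can be modeled without wasmparser at all. The sketch below is a toy, standard-library-only stand-in: ToyChunk, toy_parse, and collect_records are hypothetical names, and the "format" (a one-byte length followed by that many bytes) is invented for illustration. It is not how wasmparser parses anything; it only mirrors the contract that NeedMoreData is returned only while eof is false, and that consumed bytes are never fed back.

```rust
use std::io::Read;

/// Hypothetical stand-in for the two `Chunk` shapes described above.
#[derive(Debug, PartialEq)]
enum ToyChunk<'a> {
    /// Not enough bytes yet; carries a hint of how many more are wanted.
    NeedMoreData(u64),
    /// One record was parsed; `payload` borrows from the input buffer.
    Parsed { consumed: usize, payload: &'a [u8] },
}

/// Toy parser: a record is a one-byte length followed by that many bytes.
/// An incomplete record is an error only once `eof` is true.
fn toy_parse(data: &[u8], eof: bool) -> Result<ToyChunk<'_>, String> {
    let (len, rest) = match data.split_first() {
        Some((&len, rest)) => (len as usize, rest),
        None if eof => return Err("unexpected end of input".to_string()),
        None => return Ok(ToyChunk::NeedMoreData(1)),
    };
    if rest.len() < len {
        if eof {
            return Err("record truncated at end of input".to_string());
        }
        return Ok(ToyChunk::NeedMoreData((len - rest.len()) as u64));
    }
    Ok(ToyChunk::Parsed { consumed: 1 + len, payload: &rest[..len] })
}

/// Drives `toy_parse` from a reader, following the protocol: read more when
/// asked, and drop consumed bytes so they are never passed in again.
fn collect_records(mut reader: impl Read) -> Result<Vec<Vec<u8>>, String> {
    let mut buf = Vec::new();
    let mut eof = false;
    let mut records = Vec::new();
    loop {
        // The toy format has no end marker, so an empty buffer at EOF is "done".
        if eof && buf.is_empty() {
            return Ok(records);
        }
        match toy_parse(&buf, eof)? {
            ToyChunk::NeedMoreData(hint) => {
                // Grow by the hint, read into the new space, trim what was unused.
                let len = buf.len();
                buf.resize(len + hint as usize, 0);
                let n = reader.read(&mut buf[len..]).map_err(|e| e.to_string())?;
                buf.truncate(len + n);
                eof = n == 0;
            }
            ToyChunk::Parsed { consumed, payload } => {
                // Copy the payload out before the buffer is modified.
                records.push(payload.to_vec());
                buf.drain(..consumed);
            }
        }
    }
}
```

Because each parsed payload borrows the input buffer, the sketch copies it out before draining the consumed bytes, which is the same ordering constraint the lifetime on Chunk imposes on real callers.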

Errors

Parse errors are returned as an Err. Errors can happen, for example, when the structure of the module is unexpected or when sections are too large. Note that errors are not returned here for malformed contents of sections. Sections are generally not individually parsed, and each returned Payload needs to be iterated over further to detect all errors.

Examples

An example of reading a wasm file from a stream (std::io::Read) and incrementally parsing it.

use std::io::Read;
use anyhow::Result;
use wasmparser::{Parser, Chunk, Payload::*};

fn parse(mut reader: impl Read) -> Result<()> {
    let mut buf = Vec::new();
    let mut parser = Parser::new(0);
    let mut eof = false;
    let mut stack = Vec::new();

    loop {
        let (payload, consumed) = match parser.parse(&buf, eof)? {
            Chunk::NeedMoreData(hint) => {
                assert!(!eof); // otherwise an error would be returned

                // Use the hint to preallocate more space, then read
                // some more data into our buffer.
                //
                // Note that the buffer management here is not ideal,
                // but it's compact enough to fit in an example!
                let len = buf.len();
                buf.extend((0..hint).map(|_| 0u8));
                let n = reader.read(&mut buf[len..])?;
                buf.truncate(len + n);
                eof = n == 0;
                continue;
            }

            Chunk::Parsed { consumed, payload } => (payload, consumed),
        };

        match payload {
            // Each of these would be handled individually as necessary
            Version { .. } => { /* ... */ }
            TypeSection(_) => { /* ... */ }
            ImportSection(_) => { /* ... */ }
            AliasSection(_) => { /* ... */ }
            InstanceSection(_) => { /* ... */ }
            ModuleSection(_) => { /* ... */ }
            FunctionSection(_) => { /* ... */ }
            TableSection(_) => { /* ... */ }
            MemorySection(_) => { /* ... */ }
            EventSection(_) => { /* ... */ }
            GlobalSection(_) => { /* ... */ }
            ExportSection(_) => { /* ... */ }
            StartSection { .. } => { /* ... */ }
            ElementSection(_) => { /* ... */ }
            DataCountSection { .. } => { /* ... */ }
            DataSection(_) => { /* ... */ }

            // Here we know how many functions we'll be receiving as
            // `CodeSectionEntry`, so we can prepare for that, and
            // afterwards we can parse and handle each function
            // individually.
            CodeSectionStart { .. } => { /* ... */ }
            CodeSectionEntry(body) => {
                // here we can iterate over `body` to parse the function
                // and its locals
            }

            // When parsing nested modules we need to switch which
            // `Parser` we're using.
            ModuleCodeSectionStart { .. } => { /* ... */ }
            ModuleCodeSectionEntry { parser: subparser, .. } => {
                stack.push(parser);
                parser = subparser;
            }

            CustomSection { name, .. } => { /* ... */ }

            // most likely you'd return an error here
            UnknownSection { id, .. } => { /* ... */ }

            // Once we've reached the end of a module we either resume
            // at the parent module or we break out of the loop because
            // we're done.
            End => {
                if let Some(parent_parser) = stack.pop() {
                    parser = parent_parser;
                } else {
                    break;
                }
            }
        }

        // once we're done processing the payload we can forget the
        // original.
        buf.drain(..consumed);
    }

    Ok(())
}

pub fn parse_all<'a>(
    self,
    data: &'a [u8]
) -> impl Iterator<Item = Result<Payload<'a>>> + 'a

Convenience function that can be used to parse a module entirely resident in memory.

This function will parse the data provided as a WebAssembly module, assuming that data represents the entire WebAssembly module.

Note that when this function yields ModuleCodeSectionEntry no action needs to be taken with the returned parser. The iterator switches to that parser internally and continues to return payloads.

pub fn skip_section(&mut self)

Skip parsing the code or module code section entirely.

This function can be used to indicate, after receiving CodeSectionStart or ModuleCodeSectionStart, that the section will not be parsed.

The caller will be responsible for skipping size bytes (found in the CodeSectionStart or ModuleCodeSectionStart payload). Bytes should only be fed into parse after the size bytes have been skipped.
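When the input is streamed rather than fully resident in memory, the size bytes to skip may extend past what is currently buffered. The following is a minimal sketch of one way to handle that, using only the standard library. The helper skip_bytes is hypothetical (not part of wasmparser) and assumes the caller keeps unconsumed input in a Vec<u8> in front of a std::io::Read source, and has already drained the bytes consumed by the CodeSectionStart or ModuleCodeSectionStart payload.

```rust
use std::io::{self, Read};

/// Hypothetical helper, not a wasmparser API: discards the next `size` bytes
/// of input, taking them from the front of `buf` first and then from `reader`.
/// Fails with `UnexpectedEof` if the input ends before `size` bytes are skipped.
fn skip_bytes(buf: &mut Vec<u8>, reader: &mut impl Read, size: u64) -> io::Result<()> {
    // Drop as much as possible from the bytes that are already buffered.
    let from_buf = (buf.len() as u64).min(size) as usize;
    buf.drain(..from_buf);

    // Discard the remainder directly from the reader without buffering it.
    let remaining = size - from_buf as u64;
    if remaining > 0 {
        let skipped = io::copy(&mut reader.by_ref().take(remaining), &mut io::sink())?;
        if skipped < remaining {
            return Err(io::Error::new(
                io::ErrorKind::UnexpectedEof,
                "section extends past end of input",
            ));
        }
    }
    Ok(())
}
```

After such a helper returns, the next bytes handed to parse would be the first bytes following the skipped section, which is what the paragraph above requires.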

Panics

This function will panic if the parser is not in a state where it’s parsing the code or module code section.

Examples

use wasmparser::{Result, Parser, Chunk, Range, SectionReader, Payload::*};

fn objdump_headers(mut wasm: &[u8]) -> Result<()> {
    let mut parser = Parser::new(0);
    loop {
        let payload = match parser.parse(wasm, true)? {
            Chunk::Parsed { consumed, payload } => {
                wasm = &wasm[consumed..];
                payload
            }
            // this state isn't possible with `eof = true`
            Chunk::NeedMoreData(_) => unreachable!(),
        };
        match payload {
            TypeSection(s) => print_range("type section", &s.range()),
            ImportSection(s) => print_range("import section", &s.range()),
            // .. other sections

            // Print the range of the code section we see, but don't
            // actually iterate over each individual function.
            CodeSectionStart { range, size, .. } => {
                print_range("code section", &range);
                parser.skip_section();
                wasm = &wasm[size as usize..];
            }
            End => break,
            _ => {}
        }
    }
    Ok(())
}

fn print_range(section: &str, range: &Range) {
    println!("{:>40}: {:#010x} - {:#010x}", section, range.start, range.end);
}

Trait Implementations

impl Clone for Parser

fn clone(&self) -> Parser

Returns a copy of the value.

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.

impl Debug for Parser

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

impl Default for Parser

fn default() -> Parser

Returns the “default value” for a type.

Auto Trait Implementations

impl RefUnwindSafe for Parser

impl Send for Parser

impl Sync for Parser

impl Unpin for Parser

impl UnwindSafe for Parser

Blanket Implementations

impl<T> Any for T where
T: 'static + ?Sized,

pub fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T where
T: ?Sized,

pub fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T where
T: ?Sized,

pub fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> From<T> for T

pub fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T where
U: From<T>,

pub fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> ToOwned for T where
T: Clone,

type Owned = T

The resulting type after obtaining ownership.

pub fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning.

pub fn clone_into(&self, target: &mut T)

🔬 This is a nightly-only experimental API. (toowned_clone_into)

Uses borrowed data to replace owned data, usually by cloning.

impl<T, U> TryFrom<U> for T where
U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

pub fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T where
U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

pub fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.