Parsing the input with nom is faster than using the regex crate.
The input for today looks something like this:
1-3 a: abcde1-3 b: cdefg2-9 c: ccccccccc
I used the same scaffolding for each part and set up a process_password_line
function to process each line.
pub fn process_part1(input: &str) -> usize {input.lines().filter_map(|line| match process_password_line(line) {Ok(true) => Some(true),Ok(false) => None,Err(_) => None,}).count()}
My first attempt was using regex to parse each line. This resulted in some unwieldy code that had a lot of unwraps and additional parsing. The error handling would've been annoying to "get right" so as a result I mostly skipped it. There are also some issues with differing types. You can see that character
is actually a str
which means I have to use contains
instead of equality. Overall kind of a messy way to go about solving the problem, but it definitely works.
fn process_password_line(line: &str) -> Result<bool, std::io::Error> {lazy_static! {static ref RE: Regex = Regex::new(r"([\d]+)-([\d]+) ([a-z]): ([a-z]+)").unwrap();}let caps = RE.captures(line).unwrap();let lower_bound = caps.get(1).unwrap().as_str().parse::<usize>().unwrap();let upper_bound = caps.get(2).unwrap().as_str().parse::<usize>().unwrap();let character = caps.get(3).unwrap().as_str();let password = caps.get(4).unwrap().as_str();let char_count = password.chars().filter(|c| character.contains(*c)).count();if char_count < lower_bound || char_count > upper_bound {Ok(false)} else {Ok(true)}}
Given how messy this was, I wanted to try my hand at using a parser written with nom, a parser combinator library. This felt to me more "fit for purpose" than the regex approach.
nom is a parser combinator library with a focus on safe parsing, streaming patterns, and as much as possible zero copy.
This line from the docs explains the motivation for using nom. My favorite answer to day 1 was a victim of too many allocations, so I wanted to see what nom could do compared to regex.
fn process_password_line(line: &str) -> Result<bool, std::io::Error> {let result = password_line(line).map(|(_, pass)| {let char_count = pass.password.chars().filter(|c| pass.character == *c).count();char_count >= pass.lower.into() && char_count <= pass.upper.into()});match result {Ok(b) => Ok(b),_ => Err(Error::new(ErrorKind::Other, "failed to parse")),}}
Given the parser, the actual logic for dealing with each line becomes significantly simpler. My only hangup was on converting nom's errors into a more "standard" error type.
The parser itself is fairly readable, even if you don't know what a parser combinator is. The important parts are that we're shadowing input
each time, which returns the result minus the parsed piece for the next parser.
use nom::{bytes::complete::tag, character::complete::*, IResult};struct PasswordLine<'a> {lower: u8,upper: u8,character: char,password: &'a str,}fn password_line(input: &str) -> IResult<&str, PasswordLine> {let (input, lower) = digit1(input)?;let (input, _) = tag("-")(input)?;let (input, upper) = digit1(input)?;let (input, _) = tag(" ")(input)?;let (input, parsed_character) = anychar(input)?;let (input, _) = tag(": ")(input)?;let (input, password) = alpha1(input)?;Ok((input,PasswordLine {lower: lower.parse::<u8>().unwrap(),upper: upper.parse::<u8>().unwrap(),character: parsed_character,password,},))}
The regex approach benchmarked at about 1ms while the parser approach benchmarked at 145 nanoseconds. I attribute this to the zero-copy approach nom uses and the complexity of regex engines, but I did not check that assumption.
| example | lower bound | best guess | upper bound | | ------- | ----------- | ---------- | ----------- | | nom | 143.80 us | 144.83 us | 145.99 us | | regex | 984.72 us | 994.95 us | 1007.40 us |