Scaling Database Access in Rust

Safe and Efficient Database Operations in Rust with Prisma

At Gitar, we are building a complex, rapidly-evolving application with Rust as a core part of our tech stack. As our engineering team scales, ensuring safe and efficient database access across our codebase has become a critical challenge. In this post, we'll dive into how we leveraged Prisma, a modern Object-Relational Mapping (ORM) framework, to achieve compile-time query safety while empowering our developers to move fast. We'll cover our decision process, the benefits and tradeoffs of ORMs versus raw SQL, and the nitty-gritty of integrating Prisma into our team's workflow. Let's jump in!

The SQL Client vs ORM Dilemma

When interacting with databases in Rust, developers have two primary approaches to choose from:

  1. SQL query builders and client libraries such as sqlx or tokio-postgres that allow writing raw SQL queries. This approach has several advantages:

    • It retains the full expressiveness of SQL, which is crucial for complex queries with numerous joins, subqueries, or database-specific syntax.

    • It keeps developers close to the actual SQL being executed, which can aid in performance tuning and debugging.

However, there are also some cons:

  • Dynamically constructing queries through string manipulation can be verbose and error-prone.

  • SQL clients often don't provide compile-time checks for the validity of the SQL queries, which can lead to runtime errors if there are typos or mismatches between the query and the actual database schema. [1]
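To make these two drawbacks concrete, here is a minimal sketch that uses only the standard library and no database driver. The users table, its columns, and the build_user_query helper are hypothetical names chosen for illustration; they are not taken from sqlx, tokio-postgres, or our codebase.

```rust
/// Builds a SELECT statement by string concatenation, the way ad hoc
/// dynamic queries are often assembled. Hypothetical helper for illustration.
fn build_user_query(role: Option<&str>, only_named: bool) -> String {
    let mut sql = String::from("SELECT id, email FROM users");
    let mut clauses: Vec<String> = Vec::new();

    if let Some(role) = role {
        // Interpolating values directly is also an injection risk; real code
        // should use bind parameters instead.
        clauses.push(format!("role = '{}'", role));
    }
    if only_named {
        // Deliberate typo: the column is `name`, not `nmae`. The compiler
        // cannot see inside the string, so this only fails once the database
        // actually runs the query.
        clauses.push(String::from("nmae IS NOT NULL"));
    }
    if !clauses.is_empty() {
        sql.push_str(" WHERE ");
        sql.push_str(&clauses.join(" AND "));
    }
    sql
}
```

The function compiles without complaint even though one branch produces invalid SQL, and the bookkeeping for WHERE and AND grows with every optional filter.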

  2. ORMs (Object-Relational Mappers) like Prisma, Diesel, or SeaORM. These let you define your database schema either in a dedicated domain-specific language (DSL) or in Rust itself, and provide a high-level, type-safe API for querying and mutating data. The benefits of this approach include:

    • Ergonomic query building: ORMs provide expressive, chainable methods for constructing queries, with the help of Rust's type system to guide developers.

    • Simplified handling of complex queries: ORMs often provide utilities for common query patterns like filtering, batch updates, and relation traversal.

    • Strong type safety: the ORM generates Rust types from your database schema, often catching issues at compile time.

However, ORMs also come with some tradeoffs:

  • Abstraction overhead: the ORM's API can be less intuitive for very complex queries compared to writing raw SQL.

  • Performance: while modern ORMs are highly optimized, there is inherently some overhead compared to hand-tuned SQL, as each query must be pre-processed by the ORM’s engine.
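As a rough illustration of what "type-safe query building" means, the sketch below hand-writes a tiny typed filter API. It is not the Prisma, Diesel, or SeaORM API: UserColumn, UserFilter, and find_users are hypothetical names, and a real ORM would generate the equivalent types from the schema instead of having them written by hand.

```rust
/// Columns of a hypothetical `users` table. In a real ORM this type would be
/// generated from the schema rather than written by hand.
#[derive(Clone, Copy)]
enum UserColumn {
    Email,
    Name,
}

impl UserColumn {
    /// Maps each variant to its column name in SQL.
    fn as_sql(self) -> &'static str {
        match self {
            UserColumn::Email => "email",
            UserColumn::Name => "name",
        }
    }
}

/// A filter that can only mention columns that exist in `UserColumn`.
enum UserFilter {
    Equals(UserColumn, String),
    IsNotNull(UserColumn),
}

/// Renders filters into a parameterized query plus its bind values.
fn find_users(filters: &[UserFilter]) -> (String, Vec<String>) {
    let mut sql = String::from("SELECT id, email FROM users");
    let mut params = Vec::new();
    let mut clauses = Vec::new();
    for filter in filters {
        match filter {
            UserFilter::Equals(col, value) => {
                // Values become numbered placeholders, never inline SQL.
                params.push(value.clone());
                clauses.push(format!("{} = ${}", col.as_sql(), params.len()));
            }
            UserFilter::IsNotNull(col) => {
                clauses.push(format!("{} IS NOT NULL", col.as_sql()));
            }
        }
    }
    if !clauses.is_empty() {
        sql.push_str(" WHERE ");
        sql.push_str(&clauses.join(" AND "));
    }
    (sql, params)
}
```

A misspelled column such as UserColumn::Nmae is now a compile error rather than a runtime failure. The cost is the extra layer between the caller and the SQL that actually runs, which is the abstraction overhead described above.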

Ultimately, there's no one-size-fits-all answer. The choice depends on the specific needs and constraints of each project and team. Some teams even adopt a hybrid approach, using an ORM for most queries while dropping down to raw SQL for performance-critical sections involving more complicated queries.

Why We Chose Prisma

At Gitar, we've embraced Prisma as our primary tool for database access in Rust. Several factors influenced this decision:

  1. Type-safe schema definition: Prisma's schema DSL allows us to declaratively define our database tables, columns, indexes, and relations in a type-safe manner. This serves as a single source of truth for our database schema and provides compile-time validation of our data model.

Example Schema Snippet from Prisma Docs:

model User {
  id        Int      @id @default(autoincrement())
  createdAt DateTime @default(now())
  email     String   @unique
  name      String?
  role      Role     @default(USER)
}

enum Role {
  USER
  ADMIN
}

  2. Ergonomic migrations: With Prisma, evolving our database schema is as simple as modifying the schema file and running a prisma migrate command. Prisma determines the necessary SQL commands to transition the database to the new schema, handling the complexities of schema migrations.

  3. Powerful, type-safe query building: Prisma generates a custom, type-safe client based on our schema definition. This client provides an expressive, auto-completing API for constructing queries, giving us the benefits of an ORM while retaining the flexibility to drop down to raw SQL for more complex queries when needed.

Integrating Prisma into Our Workflow

Adopting Prisma in a team environment introduced some additional challenges beyond the typical single-developer setup:

  1. Schema synchronization: Prisma relies on a central schema definition file, which needs to be kept in sync across all developers' local environments and CI. Whenever a developer makes a change to the schema, they need to regenerate the Prisma client via cargo prisma generate to update the generated Rust types.

  2. Handling generated code: The Prisma client is generated code, which isn't typically checked into version control. However, in this case, it's not just a matter of best practices; the generated client actually can't be checked into version control due to its build-time dependency on absolute file paths specific to each developer's environment. [2] Naively gitignoring the generated client would force developers to manually regenerate it on almost every pull, even if they weren’t directly working on logic that interacted with the Prisma client.

To streamline this, we developed a custom build script integration:

  1. We add the generated Prisma client to our .gitignore to keep our repository clean of generated code.

  2. In the Rust crate where the generated Prisma client lives, we add a build.rs script.

[package]
# ...
build = "build.rs"

  3. This script runs the cargo prisma generate command to regenerate the Prisma client whenever the crate needs to be built (which will be the case whenever application or library crates that interact with the client are built).

    • To avoid Cargo deadlocks, we use the temp-env crate in our build script to set the CARGO_TARGET_DIR environment variable to a separate directory for the prisma generate command. This is necessary because cargo prisma generate itself invokes Cargo to build the Prisma client library. If this inner Cargo invocation uses the same target directory as the outer Cargo process that's running our build script, it can lead to a deadlock situation where both processes are waiting for exclusive access to the same directory. By temporarily overriding CARGO_TARGET_DIR for the prisma generate command, we ensure that it uses a separate directory for its build artifacts, avoiding any potential conflicts with the main build process.

 // Excerpt from build.rs. Assumes `use std::{fs, process::Command};` and that
 // `git_root`, `schema_path`, and `prisma_client_path` are defined earlier.

 // Determine if the client needs to be regenerated.
 let regenerate = if prisma_client_path.exists() {
   // Regenerate only if the schema was modified after the client was last generated.
   let prisma_client_metadata = fs::metadata(&prisma_client_path).unwrap();
   let schema_metadata = fs::metadata(&schema_path).unwrap();
   schema_metadata.modified().unwrap() > prisma_client_metadata.modified().unwrap()
 } else {
   // Generate if the client doesn't exist yet.
   true
 };

 if regenerate {
   // Set the Cargo target directory to a different location from the main project to avoid a deadlock.
   let prisma_gen_target_dir = git_root.join("build/prisma_gen_target");
   temp_env::with_var("CARGO_TARGET_DIR", Some(prisma_gen_target_dir), || {
     let output = Command::new("cargo")
       .current_dir(git_root)
       .args([
         "prisma",
         "generate",
         "--schema",
         schema_path.to_str().unwrap(),
       ])
       .output()
       .expect("failed to run `cargo prisma generate`.");
     // `output()` returns Ok even if the command exits with an error, so check the exit status too.
     assert!(
       output.status.success(),
       "failed to regenerate Prisma client: {}",
       String::from_utf8_lossy(&output.stderr)
     );
   });
 }

  4. With this setup, developers simply need to use cargo build as usual. The build script ensures the Prisma client is always up to date, without requiring any manual intervention. There is a small one-time cost per repo clone: populating the separate Cargo build cache adds build time to the first code generation step. That has been a worthwhile tradeoff for the improved developer experience and codebase reliability.
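The regeneration check and the target directory override can also be sketched with only the standard library. This is a minimal sketch and not the script described in this post: needs_regeneration, generate_command, and their parameters are hypothetical names. It also assumes that setting CARGO_TARGET_DIR through Command::env, which changes only the child process's environment, is an acceptable substitute for the process-wide override done with temp-env.

```rust
use std::fs;
use std::io;
use std::path::Path;
use std::process::Command;

/// Returns true when `client` is missing or older than `schema`.
/// Hypothetical helper; errors reading metadata are passed to the caller.
fn needs_regeneration(schema: &Path, client: &Path) -> io::Result<bool> {
    if !client.exists() {
        return Ok(true);
    }
    let schema_mtime = fs::metadata(schema)?.modified()?;
    let client_mtime = fs::metadata(client)?.modified()?;
    Ok(schema_mtime > client_mtime)
}

/// Builds (but does not run) the generate command, with CARGO_TARGET_DIR
/// set only in the child's environment so the inner Cargo invocation uses
/// its own target directory.
fn generate_command(workspace_root: &Path, schema: &Path, target_dir: &Path) -> Command {
    let mut cmd = Command::new("cargo");
    cmd.current_dir(workspace_root)
        .env("CARGO_TARGET_DIR", target_dir)
        .arg("prisma")
        .arg("generate")
        .arg("--schema")
        .arg(schema);
    cmd
}
```

In a build.rs, main would call needs_regeneration and, when it returns true, run the command returned by generate_command and check its exit status. Passing paths as Path values rather than strings also avoids the to_str().unwrap() panic on paths that are not valid UTF-8.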

This script is also visible in our public Tunes repository, where we aim to share useful code snippets, utilities, and patterns with the wider developer community.

An alternative approach we considered was building the Prisma CLI binary separately from the rest of our code, perhaps making it downloadable via our onboarding bootstrap script as an internal artifact, and invoking the binary directly from build.rs. This would eliminate any concerns around Cargo deadlocking, and remove the need for temporarily setting CARGO_TARGET_DIR. However, the maintainers of the Prisma Rust Client explicitly discourage this type of usage due to the potential for version mismatches between the CLI binary and the Prisma client library.

Summary

Prisma has allowed us to leverage the power of a modern ORM while retaining the performance and reliability benefits of Rust. By integrating Prisma deeply into our build process, we've ensured that our developer experience remains stable and intuitive, even as our codebase and team scale.

Are you using Rust in production and have experience with database tooling? We'd love to hear your perspective! Join our Slack community to discuss tradeoffs and share your experiences.


  1. sqlx is fairly unique in that it does actually provide an optional way to achieve compile-time query checking by connecting to a live database at compile time and having the database itself verify the SQL queries. However, this approach cannot verify all dynamically constructed queries and requires a more complex setup, slowing down the build process.

  2. The Prisma Client Rust library currently uses the include_dir macro to include the schema file and migration files, which requires absolute file paths. In a discussion with the maintainers, we explored a potential solution using the include_str macro instead, which would allow for relative paths and make the generated client more portable. The maintainers expressed openness to accepting a pull request implementing this change.