Tasha Pais
tashapais at gmail dot com

I'm currently a founding ML engineer at Open Vision Engineering, an AI hardware company working to scale human cognition, where I lead the development of Pocket.

I started as a firmware engineer and wrote the Opus codec compression implementation and microcontroller configuration in C++, working with PCM16 audio, buffer management in PSRAM, and raw audio processing. My role later transitioned into app development in Dart, a strongly typed language that compiles ahead of time (AOT) to native ARM code for iOS, which gives Flutter its fast performance. I've worked with AWS private objects, polling and webhooks, fine-tuning our speech-to-text model based on audio embeddings, and function calling to enable our model to fetch data and take actions.

Previously, I studied Computer Science and Neuroscience at Rutgers and transferred to Columbia University in NYC, where I worked on robotics research under Shuran Song and Kostas Bekris. I was awarded the JFK Medical Center Scholarship, Dennis Michael Walker Academic Award, and the Computer Science Departmental Scholarship.

My current interests most align with mechanistic interpretability of language models using LoRA adapter perturbations and cross-layer transcoders. My personal philosophy has shifted over time; these days it leans toward Pyrrhonian skepticism and empty individualism. As we approach post-scarcity, what drives me most is accelerating our understanding of cognition and, more broadly, building deep tech companies that take on scientific risk.

CV / Twitter / Github / Linkedin

Updates

Projects

Computational Robotics

Implementations of Particle Filter, Kalman Filter, and Extended Kalman Filter for Localization

The project delves into sensor technologies for robotics, emphasizing localization and Bayesian reasoning for accurate robot positioning. It explores Bayesian filtering techniques, applying them to occupancy grids for spatial representation and navigation. The study further investigates particle filters and Kalman filters for dynamic system state estimation and noise filtering.
December 2023
Report   •   Github
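
To make the predict/update cycle behind the Kalman filter concrete, here is a minimal scalar sketch. It is illustrative only and not taken from the linked repository: the function name `kalman_1d`, the constant-state motion model, and the noise values `q` and `r` are all assumptions chosen for the example.

```python
def kalman_1d(measurements, q=1e-3, r=0.25, x0=0.0, p0=1.0):
    """Scalar Kalman filter for a (hypothetical) constant-state model.

    q: process noise variance, r: measurement noise variance,
    x0/p0: initial state estimate and its variance.
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Predict: the state is modelled as constant, so only uncertainty grows.
        p = p + q
        # Update: blend prediction and measurement using the Kalman gain.
        k = p / (p + r)
        x = x + k * (z - x)
        p = (1.0 - k) * p
        estimates.append(x)
    return estimates, p


if __name__ == "__main__":
    est, var = kalman_1d([5.0] * 50)
    print(est[-1], var)
```

The extended Kalman filter follows the same cycle but linearizes a nonlinear motion and measurement model at each step, while a particle filter replaces the single Gaussian with a set of weighted samples.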

Computational Strategies in Robotic Motion Planning: PRM, PRM*, RRT, RRT*

The project starts by implementing probabilistic roadmaps (PRM) and rapidly-exploring random trees (RRT). It advances into asymptotically optimal sampling-based planners, integrating potential functions for improved path efficiency. It also addresses the challenges posed by non-holonomic and under-actuated systems by building a steerable kinematic model.
November 2023
Report   •   Github

Collision-Free Navigation in up to 6 Dimensions: R^3 x SO(3)

The project covers the study and application of robotic path planning techniques: grid-based search algorithms, visibility graphs, and combinatorial planning. It analyzes space through trapezoidal decomposition, introduces configuration spaces (C-spaces) and abstracts them to simplify complex planning scenarios, and applies the foundational concepts of sampling-based motion planning.
October 2023
Report   •   Github
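
As a small illustration of grid-based search, the sketch below runs breadth-first search on an occupancy grid. It is a simplified stand-in and not the project's code: the function name `shortest_path`, the 4-connected neighborhood, and the convention that 0 means free and 1 means occupied are assumptions made for the example.

```python
from collections import deque


def shortest_path(grid, start, goal):
    """Breadth-first search on a 4-connected occupancy grid.

    grid[r][c] == 0 marks a free cell, 1 an obstacle (assumed convention).
    Returns the list of cells from start to goal, or None if unreachable.
    """
    rows, cols = len(grid), len(grid[0])
    parent = {start: None}  # doubles as the visited set
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            # Walk parent links back to the start to recover the path.
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] == 0 and (nr, nc) not in parent:
                parent[(nr, nc)] = cell
                queue.append((nr, nc))
    return None
```

On a grid with unit step costs, BFS returns a path with the fewest moves; weighted grids would call for Dijkstra or A* instead.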

Machine Learning

Bayes risk, Gaussian-based generative classifiers, EM for GMM
Proofs   •   Colab
Softmax over negative margins of the ensemble to derive AdaBoost
Proofs   •   Colab
SVM in dual and primal, kernel trick, stochastic subgradient descent (Pegasos)
Proofs   •   Colab   •   Handwritten notes
L2 regularization, cross-entropy loss, large-margin learning, representer theorem
Proofs   •   Colab
Least squares estimation, maximum likelihood, asymmetric squared loss
Proofs   •   Colab
Sep-Dec 2023

Jacobian Chain Rule and Linear Algebra Refresher  •   Everything I Know Written As Tiny As Possible
* All projects were implemented on datasets such as MNIST, CIFAR-10, and California housing.

Blockchain

Tile Stacking Game Using Three.js That Mints Highest Score on Phantom Wallet

I built a Tile Stacking Game leveraging Three.js for its 3D graphics, making the gameplay visually engaging. On the frontend, I wrote the game logic in JavaScript, ensuring a smooth user experience. For the backend, I chose Node.js and Express.js to handle server-side operations and API requests. To integrate blockchain functionalities, especially for minting high scores as NFTs, I employed Solana's web3.js library and connected to the Phantom Wallet.
May 2022
Github  •   Link to play

Quadratic Voting Application Deployed on Polygon Matic Testnet

A quadratic voting application deployed on Polygon, developed with Hardhat and Next.js, with smart contract tests written in Solidity and JavaScript. Inspired by Vitalik's blog post advocating for nonlinear cost functions.
December 2022
Github   •   Deployed link
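
The core rule of quadratic voting is that casting n votes on one issue costs n squared credits. The sketch below shows that arithmetic in isolation; it is not the deployed contract logic, and the function names and the notion of a per-voter credit budget are assumptions for illustration.

```python
import math


def vote_cost(votes):
    """Credits needed to cast `votes` votes on a single issue (cost = votes^2)."""
    return votes * votes


def max_votes(credits):
    """Largest number of votes affordable on one issue with `credits` credits."""
    return math.isqrt(credits)


def ballot_cost(ballot):
    """Total cost of a ballot given as {issue: votes}; costs add across issues."""
    return sum(vote_cost(abs(v)) for v in ballot.values())
```

For example, a voter with 100 credits can place 10 votes on a single issue, or spread them as 6 and 8 votes across two issues (36 + 64 credits), which is the nonlinear trade-off the design is meant to create.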

Operating Systems

Architecting Advanced Memory Management: From Custom Malloc to Multilevel Page Tables and TLB Caching

Custom malloc for virtual address allocation; two-level page table for 32-bit address translation; direct-mapped TLB for efficient address translation caching; bit manipulation for efficient memory tracking; designed a 4-level page table for 64-bit addressing; ensured thread safety and compatibility with various page sizes, benchmarked using matrix multiplication.
November 2023
Report   •   Github
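
To show how a two-level page table and a direct-mapped TLB fit together, here is a compact sketch. It is a model for illustration rather than the project's code: the 10/10/12 bit split assumes 4 KiB pages on a 32-bit address, and the class name `Translator`, the 16-entry TLB, and the dictionary-based tables are all hypothetical choices.

```python
OFFSET_BITS = 12   # assumes 4 KiB pages
INNER_BITS = 10    # second-level (page table) index
OUTER_BITS = 10    # first-level (page directory) index
TLB_ENTRIES = 16   # hypothetical TLB size


def split_address(va):
    """Split a 32-bit virtual address into (outer, inner, offset)."""
    offset = va & ((1 << OFFSET_BITS) - 1)
    inner = (va >> OFFSET_BITS) & ((1 << INNER_BITS) - 1)
    outer = (va >> (OFFSET_BITS + INNER_BITS)) & ((1 << OUTER_BITS) - 1)
    return outer, inner, offset


class Translator:
    def __init__(self):
        self.directory = {}               # outer index -> {inner index -> frame}
        self.tlb = [None] * TLB_ENTRIES   # each slot holds (vpn, frame) or None
        self.hits = 0
        self.misses = 0

    def map_page(self, va, frame):
        outer, inner, _ = split_address(va)
        self.directory.setdefault(outer, {})[inner] = frame
        # Drop any stale cached translation for this page.
        vpn = va >> OFFSET_BITS
        slot = vpn % TLB_ENTRIES
        if self.tlb[slot] is not None and self.tlb[slot][0] == vpn:
            self.tlb[slot] = None

    def translate(self, va):
        vpn = va >> OFFSET_BITS
        offset = va & ((1 << OFFSET_BITS) - 1)
        slot = vpn % TLB_ENTRIES          # direct-mapped: one candidate slot
        entry = self.tlb[slot]
        if entry is not None and entry[0] == vpn:
            self.hits += 1
            frame = entry[1]
        else:
            self.misses += 1
            outer, inner, _ = split_address(va)
            frame = self.directory[outer][inner]  # KeyError models a page fault
            self.tlb[slot] = (vpn, frame)
        return (frame << OFFSET_BITS) | offset
```

A 4-level table for 64-bit addressing follows the same pattern with more index fields peeled off the virtual page number.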

Designing Concurrency: Advanced Thread Management and Scheduling Techniques

Thread creation, yielding, exiting, joining, and synchronization using mutexes; scheduling policies including Pre-emptive Shortest Job First (PSJF) and Multi-Level Feedback Queue (MLFQ); thread context management through the ucontext API (makecontext, swapcontext).
October 2023
Report   •   Github

Systems Programming

Design and Implementation of a Multiplayer Tic-Tac-Toe Game Service

Advanced socket programming, multitasking (or select() for I/O multiplexing), and thread synchronization for secure, simultaneous game state management. Extended functionality includes interruption handling via signals, specifically using pthread_kill() to signal threads and pthread_sigmask() for signal reception control, leveraging SIGUSR1 and SIGUSR2 for managing blocked system calls.
April 2023
Github

Recreating a Command-Line Shell

I designed and implemented a command-line shell akin to bash or zsh. I used POSIX file-descriptor I/O for unbuffered input and output, manipulated the working directory, and spawned child processes to execute user commands while capturing their exit statuses. My implementation used pipe() and dup2() to redirect standard input and output, enabling the construction of pipelines between commands.
March 2023
Github
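
The pipe() and dup2() pattern behind a two-command pipeline can be sketched with Python's thin wrappers over the same system calls. This is an illustration on a Unix-like system, not the shell's actual C code; the function name `run_pipeline` and the choice of exit code 127 for a failed exec are assumptions for the example.

```python
import os


def run_pipeline(left, right):
    """Run `left | right` and return both exit statuses.

    `left` and `right` are argv lists, e.g. ["echo", "hello"].
    """
    read_end, write_end = os.pipe()

    left_pid = os.fork()
    if left_pid == 0:
        os.dup2(write_end, 1)        # child stdout -> pipe write end
        os.close(read_end)
        os.close(write_end)
        try:
            os.execvp(left[0], left)
        except OSError:
            os._exit(127)            # command could not be executed

    right_pid = os.fork()
    if right_pid == 0:
        os.dup2(read_end, 0)         # child stdin <- pipe read end
        os.close(read_end)
        os.close(write_end)
        try:
            os.execvp(right[0], right)
        except OSError:
            os._exit(127)

    # The parent must close both ends, or the reader never sees end-of-file.
    os.close(read_end)
    os.close(write_end)

    statuses = []
    for pid in (left_pid, right_pid):
        _, status = os.waitpid(pid, 0)
        statuses.append(os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1)
    return statuses
```

Closing the unused pipe ends in every process is the step that is easiest to miss: a leftover write end keeps the downstream command blocked waiting for more input.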

Computer Architecture

Crafting a Cache Simulator: Direct-mapped, N-way and Fully Associative

The simulator models memory operations such as reading and writing individual bytes under different cache mapping strategies and replacement policies. Specifically, it simulates write-through, write-allocate cache behavior and adds prefetching to better exploit spatial locality.
December 2021
Github
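
A direct-mapped cache with write-through, write-allocate behavior can be modeled with only a tag array and a few counters. The sketch below is a simplified illustration, not the simulator from the repository: the class name, the default sizes, and the "R"/"W" operation codes are assumptions, and prefetching and associativity are left out.

```python
class DirectMappedCache:
    """Counts hits, misses, and memory traffic for a direct-mapped cache."""

    def __init__(self, num_sets=4, block_size=16):
        self.num_sets = num_sets
        self.block_size = block_size
        self.tags = [None] * num_sets   # one block per set
        self.hits = 0
        self.misses = 0
        self.mem_reads = 0
        self.mem_writes = 0

    def access(self, op, address):
        block = address // self.block_size
        index = block % self.num_sets
        tag = block // self.num_sets
        if self.tags[index] == tag:
            self.hits += 1
        else:
            self.misses += 1
            # Write-allocate: a miss on a write also fetches the block.
            self.mem_reads += 1
            self.tags[index] = tag
        if op == "W":
            # Write-through: every write is forwarded to memory.
            self.mem_writes += 1
```

An N-way variant would keep a small list of tags per set together with a replacement policy such as FIFO or LRU, and a fully associative cache is the special case of a single set.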

Truth Table Generator for Digital Circuits in C

This program interprets a custom specification language describing circuits' inputs, outputs, and the logical gates connecting them. By efficiently parsing these descriptions, "truthtable" computes and prints all possible input combinations alongside their corresponding outputs, offering insights into the digital logic underlying the specified circuitry.
November 2021
Github
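
The enumeration step of a truth table generator is short once the circuit is parsed. The sketch below is written in Python rather than C for brevity and skips the parser entirely; the gate tuple format `(op, input_wires, output_wire)` is a hypothetical stand-in for the project's specification language, and gates are assumed to be listed in an order where every input is already defined.

```python
from itertools import product

# Supported gate types (an illustrative subset).
GATES = {
    "AND": lambda a, b: a & b,
    "OR": lambda a, b: a | b,
    "XOR": lambda a, b: a ^ b,
    "NOT": lambda a: 1 - a,
}


def truth_table(inputs, outputs, gates):
    """Return one row per input combination: input values then output values."""
    rows = []
    for values in product((0, 1), repeat=len(inputs)):
        wires = dict(zip(inputs, values))
        # Gates are assumed to be in topological order.
        for op, srcs, dst in gates:
            wires[dst] = GATES[op](*(wires[s] for s in srcs))
        rows.append(values + tuple(wires[o] for o in outputs))
    return rows


if __name__ == "__main__":
    half_adder = [("XOR", ("A", "B"), "S"), ("AND", ("A", "B"), "C")]
    for row in truth_table(["A", "B"], ["S", "C"], half_adder):
        print(*row)
```

If the description may list gates out of order, a topological sort over the wire dependencies would be needed before evaluation.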

Machine Learning: One-Shot Learning for Home Price Prediction in C

Programming in C, Unix environment, File I/O, dynamic memory allocation, machine-learning algorithm implementation, Gauss-Jordan elimination, matrix handling, "one-shot" learning algorithm application, weight calculation from attributes, matrices for attribute and price representation, matrix transformation and inversion.
October 2021
Github
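
One common way to compute weights in a single step is the least-squares normal equation, W = (XᵀX)⁻¹XᵀY, with the inverse obtained by Gauss-Jordan elimination. The sketch below assumes that is the formulation meant here; it is written in Python rather than C, the function names are hypothetical, and the partial pivoting is an addition for numerical safety.

```python
def transpose(m):
    return [list(row) for row in zip(*m)]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def invert(m):
    """Invert a square matrix by Gauss-Jordan elimination on [M | I]."""
    n = len(m)
    aug = [
        [float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
        for i, row in enumerate(m)
    ]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def fit(X, y):
    """Solve W = (X^T X)^-1 X^T y; X rows are houses, first column is all 1s."""
    Xt = transpose(X)
    W = matmul(matmul(invert(matmul(Xt, X)), Xt), [[v] for v in y])
    return [w[0] for w in W]


def predict(X, w):
    return [sum(a * b for a, b in zip(row, w)) for row in X]
```

With attribute rows `[[1, 1], [1, 2], [1, 3]]` and prices `[3, 5, 7]`, the fitted weights come out close to `[1, 2]`, that is, price = 1 + 2 × attribute.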

Consulting

Building low-cost, cloud computing institutes in Dharavi, Mumbai in Partnership with UN SDG 12

In partnership with the UN, presented at the 2021 General Assembly; building low-cost cloud computing institutes in Dharavi, Mumbai; refer to slides 28 and 29 for break-even calculations with net output and capital cushions.
May 2021
Deck