I am a software engineer. I enjoy working on compilers and open source software, and I am interested in the mathematics behind machine learning.

  • OneUpFeeds: my personal RSS aggregator for following blogs and news feeds in one place.

  • CS229 Project: Separation of Speech From Noise Challenge

    Implemented the spectrographic mask estimation approach from the PASCAL CHiME Speech Separation and Recognition Challenge. The dataset consisted of 3,600 stereo WAV recordings of 34 speakers giving simple commands against real domestic noise (TV, children, kitchen sounds) at six SNR levels from −6 dB to +9 dB. Speech and noise were represented as log-mel spectrograms, and each time-frequency cell was labeled reliable or unreliable using an oracle mask built by comparing clean speech energy to noise energy. An SVM classifier (RBF kernel, hyperparameters chosen by grid search and 5-fold cross-validation) was trained independently for each of the 26 mel-frequency bands and each of the 34 speakers — 884 models in total — using features including subband energy ratios, flatness, kurtosis, and a spectral-subtraction SNR estimate. Adding the SNR estimate as a feature improved per-band classification accuracy from ~70% to ~99%, confirming results from prior work.
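
    The oracle labeling step can be sketched in a few lines. This is a minimal illustration, not the project's code: the function name, the 0 dB threshold, and the use of linear energies as input are all assumptions.

```python
import math

def oracle_mask(clean_energy, noise_energy, threshold_db=0.0):
    """Label each time-frequency cell reliable (True) or unreliable (False).

    clean_energy and noise_energy are 2-D lists (frames x bands) of linear
    energies. threshold_db is an assumed cutoff; the original project may
    have used a different value.
    """
    floor = 1e-12  # avoids log(0) and division by zero on silent cells
    mask = []
    for clean_row, noise_row in zip(clean_energy, noise_energy):
        row = []
        for s, n in zip(clean_row, noise_row):
            local_snr_db = 10.0 * math.log10(max(s, floor) / max(n, floor))
            row.append(local_snr_db > threshold_db)
        mask.append(row)
    return mask

# Made-up energies for two frames and two bands.
clean = [[4.0, 0.1], [1.0, 2.0]]
noise = [[1.0, 1.0], [1.0, 0.5]]
mask = oracle_mask(clean, noise)
```

    Each True cell is one where speech dominates the noise; these labels would serve as the targets for the per-band classifiers.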

  • SVD and its applications in image processing

    Derives singular value decomposition from first principles, starting from the eigendecomposition of AᵀA and AAᵀ. The paper proves that the columns of V are eigenvectors of AᵀA, the columns of U are eigenvectors of AAᵀ, and the singular values σᵢ satisfy ‖Avᵢ‖ = σᵢ. A full worked example computes U, Σ, and V for a 2×2 matrix by hand and verifies A = UΣVᵀ. The rank-k approximation Aₖ = UₖΣₖVₖᵀ is then applied to image compression: each RGB channel of the 512×512 Lenna test image is decomposed independently and reconstructed at ranks 8 through 512. At rank 32 (12.5% of original storage) the image is visually recognizable; at rank 128 it is nearly indistinguishable from the original.
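
    The hand computation for a 2×2 matrix can be mirrored in code. The sketch below assumes a full-rank matrix and uses an arbitrary example matrix, not the one from the paper. The storage helper uses one common accounting (k columns of U, k columns of V, and k singular values), which may differ from the paper's.

```python
import math

def svd_2x2(A):
    """SVD of a full-rank 2x2 matrix via the eigenvectors of A^T A.

    Returns (U, sigmas, V), where U and V are lists of column vectors and
    sigmas is ordered from largest to smallest.
    """
    (a, b), (c, d) = A
    # A^T A is symmetric: [[p, q], [q, r]].
    p, q, r = a * a + c * c, a * b + c * d, b * b + d * d
    # Rotation angle that diagonalizes A^T A; the first axis belongs to
    # the larger eigenvalue.
    theta = 0.5 * math.atan2(2 * q, p - r)
    V = [(math.cos(theta), math.sin(theta)),
         (-math.sin(theta), math.cos(theta))]
    U, sigmas = [], []
    for vx, vy in V:
        avx, avy = a * vx + b * vy, c * vx + d * vy   # A v_i
        sigma = math.hypot(avx, avy)                  # sigma_i = ||A v_i||
        if sigma < 1e-12:
            raise ValueError("this sketch assumes a full-rank matrix")
        sigmas.append(sigma)
        U.append((avx / sigma, avy / sigma))          # u_i = A v_i / sigma_i
    return U, sigmas, V

def reconstruct(U, sigmas, V):
    """Rebuild A as the sum of sigma_i * u_i * v_i^T."""
    return [[sum(s * u[i] * v[j] for u, s, v in zip(U, sigmas, V))
             for j in range(2)] for i in range(2)]

def rank_k_storage_ratio(k, rows, cols):
    """Fraction of the original rows*cols values needed at rank k."""
    return k * (rows + cols + 1) / (rows * cols)

A = [[3.0, 0.0], [4.0, 5.0]]
U, sigmas, V = svd_2x2(A)
A_rebuilt = reconstruct(U, sigmas, V)
ratio_at_32 = rank_k_storage_ratio(32, 512, 512)
```

    For this example the singular values come out as √45 and √5, and under the storage accounting above rank 32 on a 512×512 channel needs roughly 12.5% of the original values.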

  • Balancing chemical equations using Gaussian Elimination

    Applies the law of conservation of mass to reframe chemical equation balancing as solving a homogeneous linear system. For each element, a linear equation equates the number of atoms on the reactant and product sides; the unknowns are the stoichiometric coefficients. The augmented matrix is reduced to row echelon form via Gaussian elimination, and back-substitution yields the balanced coefficients. Two reactions are worked through in full: photosynthesis (xCO₂ + yH₂O → zO₂ + wC₆H₁₂O₆, solved to 6:6:6:1) and the oxidation of NADH in cellular respiration (xNADH + yH + zO₂ → pNAD + qH₂O, solved to 2:2:1:2:2). Both systems have infinitely many solutions; a free variable is fixed to the smallest integer giving whole-number coefficients.
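
    The photosynthesis example can be reproduced with exact rational arithmetic. This sketch uses Gauss-Jordan reduction to reduced row echelon form, which folds back-substitution into the elimination, so it is a close variant of the procedure described rather than a transcription of it. The function names are illustrative.

```python
from fractions import Fraction
from math import lcm  # multi-argument lcm needs Python 3.9+

def rref(rows):
    """Reduce a matrix to reduced row echelon form; return (matrix, pivot columns)."""
    m = [[Fraction(v) for v in row] for row in rows]
    pivots, r = [], 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue  # no pivot here, so this column is a free variable
        m[r], m[pivot] = m[pivot], m[r]
        m[r] = [v / m[r][col] for v in m[r]]
        for i in range(len(m)):
            if i != r and m[i][col] != 0:
                factor = m[i][col]
                m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        pivots.append(col)
        r += 1
        if r == len(m):
            break
    return m, pivots

def balance(rows):
    """Smallest whole-number solution of rows * coeffs = 0 with one free variable."""
    m, pivots = rref(rows)
    n = len(rows[0])
    free = [c for c in range(n) if c not in pivots]
    if len(free) != 1:
        raise ValueError("this sketch expects exactly one free variable")
    f = free[0]
    solution = [Fraction(0)] * n
    solution[f] = Fraction(1)          # fix the free variable to 1
    for row, col in zip(m, pivots):
        solution[col] = -row[f]        # pivot variable in terms of the free one
    scale = lcm(*(v.denominator for v in solution))
    return [int(v * scale) for v in solution]

# Columns: CO2, H2O, O2, C6H12O6 (reactants positive, products negative).
photosynthesis = [
    [1, 0, 0, -6],    # carbon
    [0, 2, 0, -12],   # hydrogen
    [2, 1, -2, -6],   # oxygen
]
coefficients = balance(photosynthesis)
```

    Here the free variable is the glucose coefficient; setting it to 1 already gives whole numbers, which matches the 6:6:6:1 result.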

  • Laplace Transform Application, Solving Differential Equations describing a projectile in motion

    Models projectile motion as a pair of independent second-order ODEs in the horizontal (x) and vertical (y) directions, incorporating air resistance (proportional to velocity) and a propulsion force modeled as a step function active for time tp. Applying the Laplace transform converts each ODE into an algebraic equation in s; the resulting expressions are simplified using partial fraction decomposition and the second shifting theorem (for the delayed step function) before applying the inverse transform. The closed-form solutions x(t) and y(t) each contain exponential decay terms from air resistance and unit-step terms that switch off the propulsion contribution after tp. Special cases are verified: setting tp = 0 or p = 0 recovers free fall, and setting k = 0 (no air resistance) recovers the classical parabolic trajectory y(t) = −gt²/2.
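
    A simplified version of the vertical solution can be checked numerically. The sketch below assumes the drag-only case with no propulsion, the equation m·y″ = −k·y′ − m·g, and initial conditions y(0) = 0 and y′(0) = v0. These symbols and conditions are assumptions made for illustration and may not match the paper's exact setup.

```python
import math

def velocity(t, v0, k, m=1.0, g=9.81):
    """Closed-form y'(t) for m*y'' = -k*y' - m*g with y'(0) = v0."""
    terminal = m * g / k  # magnitude of the terminal velocity
    return (v0 + terminal) * math.exp(-k * t / m) - terminal

def height(t, v0, k, m=1.0, g=9.81):
    """Closed-form y(t) with y(0) = 0, obtained by integrating velocity()."""
    terminal = m * g / k
    # -expm1(-x) equals 1 - exp(-x) but stays accurate when x is tiny.
    return (m / k) * (v0 + terminal) * -math.expm1(-k * t / m) - terminal * t

# Special case: with almost no drag and zero initial velocity,
# the height approaches -g*t^2/2.
nearly_drag_free = height(1.0, v0=0.0, k=1e-6)

# Spot-check the ODE with a central finite difference at t = 1.
t, h, v0, k = 1.0, 1e-5, 12.0, 0.5
dv_dt = (velocity(t + h, v0, k) - velocity(t - h, v0, k)) / (2 * h)
ode_residual = dv_dt - (-k * velocity(t, v0, k) - 9.81)  # m = 1
```

    The residual should be close to zero, and the nearly drag-free height should be close to −4.905 m after one second, consistent with the k = 0 special case above.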

  • Spam Classification using Naive Bayes

    Implements and benchmarks five Naive Bayes variants on 18,828 newsgroup messages. Multivariate Bernoulli Naive Bayes (MVB) treats each word as a binary feature (present/absent) and achieves 84.98% accuracy; Multinomial Naive Bayes (MNB) uses term frequencies instead and reaches 95.03%. Chi-square feature selection is used to rank words per newsgroup and retrain the models on the top-k vocabulary: MVB improves with fewer features (less noise), while MNB initially drops and then recovers as k grows. Three improvements to MNB are then implemented following Rennie et al.: Complement Naive Bayes (CNB, 96.28%) corrects for class-size skew by estimating each class's weights from the documents of all other classes; Weight-normalized CNB (WCNB, 96.09%) normalizes the weight vectors to dampen the effect of inter-word dependencies; Transformed WCNB with TF, IDF, and length normalization (TWCNB, 97.57%) further reduces bias from document length and common words. Counterintuitively, disabling stemming and lowercasing pushes TWCNB to 98.65%, suggesting the preprocessing was discarding discriminative signal.
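
    The difference between the multinomial and complement estimates can be sketched on a toy corpus. This is a minimal illustration with add-one smoothing and made-up documents, not the benchmarked implementation. The complement branch keeps the class prior, which is a simplification, and it omits the weight normalization and TF/IDF transforms of WCNB and TWCNB.

```python
import math
from collections import Counter, defaultdict

def train(docs, labels, alpha=1.0, complement=False):
    """Fit multinomial NB, or complement NB when complement=True.

    Returns {class: (log_prior, {word: weight})}.
    """
    vocab = sorted({w for doc in docs for w in doc})
    counts = defaultdict(Counter)            # class -> word counts
    for doc, label in zip(docs, labels):
        counts[label].update(doc)
    total = Counter()
    for word_counts in counts.values():
        total.update(word_counts)
    priors = Counter(labels)
    model = {}
    for c in counts:
        # Complement NB estimates word frequencies from every class except c.
        source = total - counts[c] if complement else counts[c]
        denom = sum(source.values()) + alpha * len(vocab)
        # A word that is common outside class c counts against c, so the
        # complement weights take the opposite sign.
        sign = -1.0 if complement else 1.0
        weights = {w: sign * math.log((source[w] + alpha) / denom) for w in vocab}
        model[c] = (math.log(priors[c] / len(labels)), weights)
    return model

def predict(model, doc):
    """Return the class with the highest score; unseen words are ignored."""
    def score(c):
        log_prior, weights = model[c]
        return log_prior + sum(weights.get(w, 0.0) for w in doc)
    return max(model, key=score)

# Made-up toy corpus with two newsgroup-like classes.
docs = [["puck", "goal", "ice"], ["goal", "team", "ice"],
        ["orbit", "rocket", "launch"], ["rocket", "moon", "orbit"]]
labels = ["hockey", "hockey", "space", "space"]

mnb = train(docs, labels)
cnb = train(docs, labels, complement=True)
mnb_guess = predict(mnb, ["rocket", "orbit"])
cnb_guess = predict(cnb, ["rocket", "orbit"])
```

    On balanced data like this the two variants agree; the complement estimate matters when some classes have far fewer training documents than others.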

I used to be a Folding@Home contributor.