Posts
-
GMV Forecasting via xDeepFM
: Machine Learning, Deep Learning
In this post, I aim to share how I conducted a proof of concept (PoC) to solve a real-world problem using deep learning techniques, emphasizing a clean code structure.
-
Decoding LoRA: A Comprehensive Summary on Low-Rank Adaptation
: Machine Learning, Deep Learning, NLP, LLM
Recently, I came across an intriguing article on low-rank techniques employed in Large Language Models (LLMs), specifically LoRA: Low-Rank Adaptation of Large Language Models. Here’s a succinct summary of the key concepts, along with additional discussion.
-
Comparing Sequence-to-Sequence Decoders: With and Without Attention
: Machine Learning, Deep Learning, NLP
This post goes beyond a conventional code walkthrough inspired by this tutorial. My goal is to elevate the narrative by offering a comprehensive comparison between Seq2Seq (sequence-to-sequence) models with and without attention.
-
Note: Derivation of Normal Bayesian Test
The Normal Bayesian test is a statistical method used in hypothesis testing, particularly in the context of Bayesian statistics. It is applied to assess the validity of a null hypothesis ($H_0$) versus an alternative hypothesis ($H_1$) by considering the posterior distribution of a parameter of interest. This method is exemplified in Statistical Inference (2nd edition) by Casella and Berger, as shown on page 379.
-
Capturing Dominant Spatial Patterns with Two-Dimensional Locations Using SpatPCA
: Software, R, Statistics, Spatial Statistics
In this demonstration, we showcase how to utilize SpatPCA for analyzing two-dimensional data to capture the most dominant spatial pattern.
-
Apply SpatPCA to Capture the Dominant Spatial Pattern with One-Dimensional Locations
: Software, R, Statistics, Spatial Statistics
In this tutorial, we explore the application of SpatPCA to capture the most dominant spatial patterns in one-dimensional data, highlighting its performance under varying signal-to-noise ratios.
-
De-identification of Personal Data
: De-identification, Statistics
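In this era of open data advocacy, how institutions can properly protect the personal privacy in their data, from collection and maintenance through to public sharing, is a widely discussed issue. For example, users want government data to become more transparent and to be enriched with new information so that it is more usable, which in turn supports more effective government decision-making. To ensure privacy and avoid exposing individuals' identities, Figure 1 gives a clear view of the process from collecting personal information to finally sharing the data. Within this process, achieving privacy protection before the data is released requires applying an appropriate safeguard to the data: de-identification. De-identification is a tool that lets an institution remove personal information from its own database, so that the data can be reused or shared with other institutions for academic or commercial research.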
-
Challenges in EOF Patterns with a Single Variable
: Statistics, Spatial Statistics
In this post, we delve into potential challenges associated with Empirical Orthogonal Function (EOF) patterns.
-
Exploring Dominant Spatial Patterns with a Single Variable
: Statistics, Spatial Statistics
In this post, we delve into the formal exploration of dominant spatial patterns, particularly in the context of climate research.
-
How to Work on Sea Surface Temperature (SST) Data
: R, Spatial Statistics, Statistics
In this post, I will show you step-by-step instructions to work on SST data in R.
-
Three Fundamental Aspects of Statistical Models
Recently, I delved into a classic Annals article, “Additive Regression and Other Nonparametric Models” by Stone (1985). The insights gleaned from this piece revolve around three fundamental aspects of statistical models: