Model Information

Technical specifications and details for all CosmicFish models.

Available Models

CosmicFish 90M

Balanced model for everyday AI tasks.

Parameters: 90 Million
Context: 512 tokens

CosmicFish 120M

A larger model with stronger performance than CosmicFish 90M at the same context length.

Parameters: 120 Million
Context: 512 tokens

CosmicFish 300M

The largest model in the family, aimed at coding and reasoning tasks.

Parameters: 369 Million
Context: 2048 tokens
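
As a rough rule of thumb, weight storage scales with parameter count times bytes per parameter. The sketch below (plain Python; fp16 and 4-bit are generic storage formats used for illustration, not measured CosmicFish artifacts, and it ignores activations and runtime overhead) estimates weight size for each model:

    def est_size_mb(n_params, bits_per_param):
        # Weights only; ignores activations, KV cache, and runtime overhead.
        return n_params * bits_per_param / 8 / 1e6

    for name, n_params in [("CosmicFish 90M", 90e6),
                           ("CosmicFish 120M", 120e6),
                           ("CosmicFish 300M", 369e6)]:
        print(f"{name}: ~{est_size_mb(n_params, 16):.0f} MB at fp16, "
              f"~{est_size_mb(n_params, 4):.0f} MB at 4-bit")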

Technical Architecture

Rotary Positional Embeddings

Encodes token positions as rotations applied to the query and key vectors, improving the model's handling of relative position and long-range context.
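
A minimal sketch of the rotation step, assuming a PyTorch-style implementation with interleaved channel pairs and the conventional base of 10000 (illustrative only, not CosmicFish's actual code):

    import torch

    def apply_rope(x, base=10000.0):
        # x: (batch, seq_len, n_heads, head_dim), head_dim even.
        _, seq_len, _, head_dim = x.shape
        # One rotation frequency per channel pair.
        inv_freq = 1.0 / (base ** (torch.arange(0, head_dim, 2, dtype=torch.float32) / head_dim))
        angles = torch.einsum("s,f->sf", torch.arange(seq_len, dtype=torch.float32), inv_freq)
        cos = angles.cos()[None, :, None, :]   # broadcast over batch and heads
        sin = angles.sin()[None, :, None, :]
        x1, x2 = x[..., 0::2], x[..., 1::2]
        # Rotate each (x1, x2) pair by its position-dependent angle.
        return torch.stack((x1 * cos - x2 * sin, x1 * sin + x2 * cos), dim=-1).flatten(-2)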

Grouped-Query Attention

Multiple query heads share a smaller set of key/value heads, reducing attention computational requirements by 40%.
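
A minimal sketch of the idea, assuming PyTorch; head counts, shapes, and the omitted causal mask are illustrative simplifications rather than CosmicFish's actual attention code:

    import torch
    import torch.nn.functional as F

    def grouped_query_attention(q, k, v):
        # q: (batch, n_q_heads, seq, head_dim); k, v: (batch, n_kv_heads, seq, head_dim)
        # with n_kv_heads < n_q_heads, so the KV cache shrinks proportionally.
        group = q.shape[1] // k.shape[1]
        k = k.repeat_interleave(group, dim=1)   # each KV head serves a group of query heads
        v = v.repeat_interleave(group, dim=1)
        scores = (q @ k.transpose(-2, -1)) / q.shape[-1] ** 0.5
        return F.softmax(scores, dim=-1) @ v    # causal masking omitted for brevity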

SwiGLU Activation

A gated feed-forward activation (SiLU-gated linear unit) that improves model convergence during training.
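
A minimal SwiGLU feed-forward block in PyTorch; the layer names and the absence of bias terms follow common practice and are assumptions, not confirmed CosmicFish details:

    import torch.nn as nn
    import torch.nn.functional as F

    class SwiGLU(nn.Module):
        # Feed-forward block: silu(x @ W_gate) gates (x @ W_up), then projects back down.
        def __init__(self, dim, hidden_dim):
            super().__init__()
            self.w_gate = nn.Linear(dim, hidden_dim, bias=False)
            self.w_up = nn.Linear(dim, hidden_dim, bias=False)
            self.w_down = nn.Linear(hidden_dim, dim, bias=False)

        def forward(self, x):
            return self.w_down(F.silu(self.w_gate(x)) * self.w_up(x))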

RMSNorm

Root-mean-square layer normalization, which stabilizes training at lower cost than standard LayerNorm by skipping mean centering and the bias term.
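
A minimal RMSNorm sketch in PyTorch; the epsilon value and learnable scale follow the usual formulation and are assumptions, not confirmed CosmicFish settings:

    import torch
    import torch.nn as nn

    class RMSNorm(nn.Module):
        # Scales activations by their root mean square; no mean subtraction or bias,
        # which makes it cheaper than standard LayerNorm.
        def __init__(self, dim, eps=1e-6):
            super().__init__()
            self.eps = eps
            self.weight = nn.Parameter(torch.ones(dim))

        def forward(self, x):
            return x * torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + self.eps) * self.weight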

Quantization

4-bit and 8-bit weight precision, reducing model size by 75% compared to full-precision storage.
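
A minimal sketch of symmetric per-tensor int8 quantization in PyTorch, illustrating where the 75% figure comes from (fp32 weights drop from 4 bytes to 1 byte each); the actual quantization scheme CosmicFish uses is not specified here:

    import torch

    def quantize_int8(w):
        # Symmetric per-tensor quantization: int8 weights plus one fp32 scale.
        scale = w.abs().max() / 127.0
        q = torch.clamp(torch.round(w / scale), -127, 127).to(torch.int8)
        return q, scale

    w = torch.randn(1024, 1024)
    q, scale = quantize_int8(w)
    print(w.nelement() * w.element_size(), "bytes fp32 ->",
          q.nelement() * q.element_size(), "bytes int8")   # 4,194,304 -> 1,048,576 (75% smaller)
    w_hat = q.to(torch.float32) * scale                     # dequantize for compute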

Training Data

Billions of tokens drawn from web text, research papers, and code.

Training Datasets

CosmicSet 1.0

Total Tokens: 6B
Sources: Web, Wikipedia

CosmicSet 2.0

Total Tokens: 60B
Sources: Web, Wikipedia, code, math, academic papers