Classify
Score text against a set of user-defined labels and return a probability distribution.
POST /classify
Overview
The /classify endpoint scores a piece of text against a set of labels you define and returns a softmax-normalised probability distribution. Each label is scored using a rubric: a yes/no question that describes what the label means. The label with the highest score is returned as top_label.
Request
- text: The text to classify. Either text or document must be provided. If both are given, the OCR text is appended after text.
- document: A base64-encoded file to classify. Supported formats: JPEG, PNG, TIFF, single-page PDF, multi-page PDF. The file is processed via AWS Textract OCR and the extracted text is used as input. Either text or document must be provided.
- MIME type of the document (e.g. "image/jpeg", "application/pdf"). Required when document is provided.
- labels: One or more label definitions. Must contain at least one item; sending an empty array returns 422.
Response
- top_label: Name of the highest-scoring label.
- scores: Map of label name → probability score. All values sum to 1.0.
- Full label list with name, score, and rubric, in the same order as the request.
- ocr_text: The raw text extracted from the document via OCR. null if no document was provided.
A score of 0.85 means the model assigned 85% of its probability mass to that label relative to the others. If you need a confidence threshold (e.g. only act if the top score exceeds 0.7), apply it yourself on the scores field.
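Since the endpoint does not apply a threshold for you, a client-side check might look like the sketch below. It assumes only the top_label and scores fields described above; the helper name confident_label and the sample responses are hypothetical and hand-written, not real API output.

```python
from typing import Optional


def confident_label(response: dict, threshold: float = 0.7) -> Optional[str]:
    """Return top_label only if its score exceeds the threshold, else None."""
    top = response["top_label"]
    score = response["scores"][top]
    return top if score > threshold else None


# Hand-written illustrative responses, not real API output.
clear = {"top_label": "finance", "scores": {"finance": 0.85, "sports": 0.15}}
unsure = {"top_label": "finance", "scores": {"finance": 0.55, "sports": 0.45}}

print(confident_label(clear))   # label is returned because 0.85 > 0.7
print(confident_label(unsure))  # None because 0.55 does not exceed 0.7
```

Returning None rather than raising leaves the caller free to fall back to a default route or a human review step.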
Error responses
| Status | Meaning |
|---|---|
| 422 Unprocessable Entity | Malformed request body, empty labels array, neither text nor document provided, or OCR failed. |
| 502 Bad Gateway | Model service unavailable or returned an error. |
Authentication
Include your API key in every request using the x-api-key header.
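As a rough sketch, the header can be attached with Python's standard library as shown below. The URL is a placeholder (the base URL is not given in this section), and the payload field names labels, name, and rubric are inferred from the descriptions on this page rather than confirmed.

```python
import json
import urllib.request

# Placeholder: replace with the real endpoint URL for your account.
CLASSIFY_URL = "https://api.example.com/classify"


def build_request(payload: dict, api_key: str,
                  url: str = CLASSIFY_URL) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the x-api-key header."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


payload = {
    "text": "hello",
    # Assumed label shape: a name plus a yes/no rubric.
    "labels": [{"name": "greeting", "rubric": "Is this text a greeting?"}],
}
req = build_request(payload, api_key="YOUR_API_KEY")
# To send it: urllib.request.urlopen(req)
```

Building the request separately from sending it makes the header easy to inspect in tests without touching the network.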
Examples
Basic topic classification
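A minimal request body for topic classification might be assembled as follows. The field names labels, name, and rubric are inferred from the field descriptions on this page and should be treated as assumptions; the helper make_label and the sample text and rubrics are illustrative.

```python
import json


def make_label(name: str, rubric: str) -> dict:
    """One label definition: a name plus the yes/no question that defines it."""
    return {"name": name, "rubric": rubric}


payload = {
    "text": "The central bank raised interest rates by 25 basis points.",
    "labels": [
        make_label("finance", "Does this text discuss banking, markets, or monetary policy?"),
        make_label("sports", "Does this text describe a sporting event, athlete, or team?"),
        make_label("technology", "Does this text describe software, hardware, or the tech industry?"),
    ],
}

# JSON body to POST to /classify with the x-api-key header.
body = json.dumps(payload)
```

The response should then carry top_label and a scores map with one entry per label name.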
Support ticket triage
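For ticket triage, one possible pattern is to classify each ticket and route it by top_label. In the sketch below the queue names, the question label, and the sample response are hypothetical, and the request field names (labels, name, rubric) are inferred from this page rather than confirmed.

```python
# Assumed request shape: "text" plus "labels" items with "name" and "rubric".
TRIAGE_LABELS = [
    {"name": "urgent",
     "rubric": "Does this text indicate that the sender needs an immediate "
               "response or is describing a time-sensitive situation?"},
    {"name": "complaint",
     "rubric": "Is this text expressing dissatisfaction, frustration, or a "
               "formal complaint about a product or service?"},
    {"name": "question",
     "rubric": "Is the sender asking how to use a product or feature?"},
]

# Hypothetical mapping from label name to an internal queue.
QUEUES = {"urgent": "on-call", "complaint": "customer-care", "question": "helpdesk"}


def triage_payload(ticket_text: str) -> dict:
    """Request body for one ticket."""
    return {"text": ticket_text, "labels": TRIAGE_LABELS}


def route(response: dict, default: str = "helpdesk") -> str:
    """Pick a queue from the response's top_label, with a fallback."""
    return QUEUES.get(response["top_label"], default)


# Hand-written illustrative response, not real API output.
sample = {"top_label": "urgent",
          "scores": {"urgent": 0.7, "complaint": 0.2, "question": 0.1}}
print(route(sample))
```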
Classifying a document
Pass a base64-encoded image or PDF in the document field. The OCR text is extracted automatically and classified against your labels. The raw OCR output is returned as ocr_text.
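A document request could be put together roughly as below. The name of the MIME type field is not stated in this section, so the sketch takes it as a parameter with a hypothetical default (document_mime_type); check the API schema for the real name. The labels, name, and rubric keys are likewise inferred, and the sample bytes are a placeholder rather than a valid image.

```python
import base64


def document_payload(file_bytes: bytes, mime_type: str, labels: list,
                     mime_field: str = "document_mime_type") -> dict:
    """Build a request body with a base64-encoded document.

    mime_field is a HYPOTHETICAL key name; the real one is not given here.
    """
    if not labels:
        raise ValueError("at least one label is required")
    return {
        "document": base64.b64encode(file_bytes).decode("ascii"),
        mime_field: mime_type,
        "labels": labels,
    }


# Placeholder bytes standing in for a real file read with open(path, "rb").
fake_file = b"\x89PNG\r\n\x1a\n" + b"\x00" * 8
labels = [{"name": "invoice",
           "rubric": "Is this document an invoice or a bill for payment?"}]
payload = document_payload(fake_file, "image/png", labels)
```

When a document is supplied, the extracted text comes back in ocr_text, which is useful for checking what the model actually saw.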
Writing good rubrics
The rubric is the most important part of a classify request. It is phrased as a yes/no question the model uses to score each label. The model scores how strongly the text “answers yes” to the question. Rules of thumb:
- Be specific. Vague rubrics produce low-confidence, noisy scores.
- Frame as a direct yes/no question. “Does this text describe X?” works better than “X content”.
- Avoid negations. “Is this text NOT about finance?” will confuse the model. Use a positive label instead.
- Keep rubrics independent. Overlapping rubrics (e.g. “Is this medical?” and “Is this about health?”) will split probability mass unpredictably.
| Label | Poor rubric | Good rubric |
|---|---|---|
| medical | medical content | Does this text describe a medical condition, symptom, treatment, or health topic? |
| urgent | urgent or important | Does this text indicate that the sender needs an immediate response or is describing a time-sensitive situation? |
| complaint | negative feedback | Is this text expressing dissatisfaction, frustration, or a formal complaint about a product or service? |
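Some of these rules of thumb can be checked mechanically before a request is sent. The sketch below is a rough heuristic, not part of the API: the function name, the warning strings, and the five-word cutoff are all arbitrary choices made for illustration, and it cannot detect overlapping rubrics.

```python
import re


def rubric_warnings(rubric: str) -> list:
    """Return heuristic warnings for a rubric; an empty list means none found."""
    warnings = []
    text = rubric.strip()
    if not text.endswith("?"):
        warnings.append("not phrased as a question")
    if re.search(r"\b(not|never)\b|n't\b", text, flags=re.IGNORECASE):
        warnings.append("contains a negation")
    if len(text.split()) < 5:  # arbitrary cutoff for "probably too vague"
        warnings.append("very short; may be too vague")
    return warnings


print(rubric_warnings("medical content"))
print(rubric_warnings("Is this text NOT about finance?"))
print(rubric_warnings("Does this text describe a medical condition, symptom, "
                      "treatment, or health topic?"))
```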
How it works
- For each label, the model scores the text against the label’s rubric.
- Raw scores are real-valued numbers (not probabilities).
- Softmax normalisation is applied across all label scores so they sum to 1.0.
- The label with the highest normalised score is returned as top_label.
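The normalisation step above can be illustrated with a plain softmax. This is a sketch of the standard formula, not the service's actual code, and the raw scores are made up for the example.

```python
import math


def softmax(raw_scores: dict) -> dict:
    """Map real-valued scores to probabilities that sum to 1.0."""
    peak = max(raw_scores.values())  # subtract the max for numerical stability
    exps = {name: math.exp(score - peak) for name, score in raw_scores.items()}
    total = sum(exps.values())
    return {name: value / total for name, value in exps.items()}


# Made-up raw scores, one per label.
raw = {"finance": 2.1, "sports": -0.4, "technology": 0.3}
scores = softmax(raw)
top_label = max(scores, key=scores.get)

print(top_label)
print(round(sum(scores.values()), 6))
```

Because softmax only depends on differences between raw scores, adding or removing a label changes every probability, which is why the scores are relative rather than absolute.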
Notes
- There is no hard limit on the number of labels, but performance degrades with very large sets (>20) since each label requires a separate model call.
- Scores are relative, not absolute. A top score of 0.4 in a 10-label request can still be the correct answer; it just means probability mass was spread across many labels.
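One way to read a top score in context is to compare it with the uniform baseline of 1/N for N labels. The helper below is a hypothetical client-side calculation, not something the API returns, and the score values are invented to match the 10-label case described above.

```python
def lift_over_uniform(scores: dict) -> float:
    """How many times larger the top score is than a uniform 1/N share."""
    top = max(scores.values())
    return top * len(scores)


# Invented distribution: one label at 0.4, the other nine sharing 0.6 equally.
scores = {"label_0": 0.4}
scores.update({f"label_{i}": 0.6 / 9 for i in range(1, 10)})

print(len(scores))
print(lift_over_uniform(scores))
```

A top score of 0.4 across ten labels is about four times the uniform share of 0.1, which is a stronger signal than the raw number alone suggests.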