Summarize is currently in private preview. Request access before integrating.
Copy one of these prompts into Claude, ChatGPT, or any AI assistant to generate integration code for the /summarization/abstractive endpoint. Start with the quick integration prompt to get something working fast, use the production-ready prompt if you’re building for a live environment, or use the async batch pipeline prompt to summarize many documents concurrently.

Prompts

Quick integration

Paste this prompt to generate a minimal Python function — useful for prototyping or one-off scripts.
Quick integration prompt
Write a Python function `summarize_text(text: str, api_key: str,
instructions: str = None) -> str` that calls the ScaleDown abstractive
summarization API and returns the summary string.

API details:
- Endpoint: POST https://api.scaledown.xyz/summarization/abstractive
- Auth: HTTP header `x-api-key: <your key>`
- Request body (JSON):
    {
      "text": "<the document or passage to summarize>",
      "instructions": "<optional extra rules>",  // optional field
      "max_tokens": 20048                         // optional field, default 20048
    }
  The "instructions" field is appended to the base faithful-summary behaviour —
  it extends, not replaces, the default. Examples:
    "Use bullet points."
    "Focus on financial figures only."
    "Write in Spanish."
    "Limit to 3 sentences."
- Success response (JSON):
    {
      "summary": "<the generated summary>",
      "input_chars": 8340,
      "output_chars": 142,
      "latency_ms": 3241
    }
- Error responses: 400 (bad request), 401 (invalid key), 429 (rate limited),
  500 (server error)

Requirements:
- Accept the API key as the second parameter.
- Omit "instructions" from the request body if it is None.
- Raise a ValueError with a descriptive message on any non-2xx HTTP response,
  including the status code and response body in the message.
- Return the `summary` string on success.

Production-ready

Paste this prompt to generate a fully typed Python service class with error handling, retries, and environment-variable-based configuration.
Production-ready prompt
Write a production-quality Python module for integrating the ScaleDown abstractive
summarization API.

API details:
- Endpoint: POST https://api.scaledown.xyz/summarization/abstractive
- Auth: HTTP header `x-api-key: <your key>`
- Request body (JSON):
    {
      "text": "<the document or passage to summarize>",
      "instructions": "<optional rules appended to base behaviour>",  // optional
      "max_tokens": 20048                                              // optional, default 20048
    }
  The "instructions" field extends (not replaces) the default faithful-summary
  behaviour. It does not allow the model to add new information or commentary.
  Example values: "Use bullet points.", "Focus on dates and key decisions only.",
  "Write in Spanish.", "Limit to 3 sentences."
- Success response (JSON):
    {
      "summary": "<generated summary>",
      "input_chars": 8340,
      "output_chars": 142,
      "latency_ms": 3241
    }
- Error responses: 400 (malformed body or missing text field), 401 (missing/invalid key),
  429 (rate limit exceeded), 500 (server error)

Apply these programming principles:

1. Environment configuration — Load the API key from the environment variable
   SCALEDOWN_API_KEY. Raise a clear ValueError at construction time if it is missing
   or empty, with a message that tells the developer exactly which variable to set.

2. Typed result — Define a SummaryResult dataclass with fields:
     summary (str), input_chars (int), output_chars (int), latency_ms (int)

3. Custom exception — Define a ScaleDownError exception class that carries
   status_code (int) and message (str), and formats them into the exception message.

4. Single-responsibility client — Implement a ScaleDownSummarizer class with one
   public method:
     summarize(
       text: str,
       instructions: str | None = None,
       max_tokens: int | None = None
     ) -> SummaryResult
   The class owns the requests.Session and sets the auth header once at __init__.
   Omit "instructions" and "max_tokens" from the request payload when they are None,
   rather than sending null values.

5. Retry with exponential backoff — Inside summarize(), on HTTP 429 or any 5xx status,
   wait 2 s before retry 1, 4 s before retry 2, 8 s before retry 3.
   Raise ScaleDownError after all three retries are exhausted.
   Raise ScaleDownError immediately on 400 or 401 (not retriable).

6. Type annotations — Add full type annotations to all functions, methods, and fields.
   No module-level mutable state.

Async batch pipeline

Paste this prompt to generate an async batch summarizer that processes many documents concurrently with controlled parallelism.
Async batch pipeline prompt
Write a Python async module that batch-summarizes a list of documents using the
ScaleDown abstractive summarization API with controlled concurrency.

ScaleDown API details:
- Endpoint: POST https://api.scaledown.xyz/summarization/abstractive
- Auth: HTTP header `x-api-key: <your key>`
- Request body (JSON):
    {
      "text": "<the document or passage to summarize>",
      "instructions": "<optional rules appended to base behaviour>",  // optional
      "max_tokens": 20048                                              // optional
    }
  "instructions" extends, not replaces, the default faithful-summary behaviour.
- Success response (JSON):
    {
      "summary": str,
      "input_chars": int,
      "output_chars": int,
      "latency_ms": int
    }
- Error responses: 400 (bad body), 401 (invalid key), 429 (rate limit), 500 (server error)

Requirements:

1. Environment configuration — Load SCALEDOWN_API_KEY from os.environ at module load.
   Raise ValueError with a clear message if the variable is absent or empty.

2. Public async function:
     batch_summarize(
       texts: list[str],
       instructions: str | None = None,
       max_tokens: int | None = None,
       concurrency: int = 5
     ) -> list[str | None]
   Returns summaries in the same order as the input list.
   Returns None at an index if that document permanently fails after all retries.

3. Concurrency control — Use asyncio + aiohttp. Limit the number of simultaneous
   in-flight requests with an asyncio.Semaphore initialized to the `concurrency`
   argument.

4. Per-request retry with exponential backoff — For each individual request, retry
   up to 3 times on HTTP 429 or any 5xx status: wait 2 s, 4 s, 8 s before retries
   1, 2, and 3. After 3 failures, store None for that index.

5. Logging — Log a WARNING via the `logging` module for each document that permanently
   fails, including its index and the final error.

6. Omit optional fields — Do not include "instructions" or "max_tokens" in the
   request payload when they are None.

7. Entry point — Include an `if __name__ == "__main__":` block that calls
   batch_summarize on a sample list of three short strings and prints each summary.
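The skeleton this prompt asks for (semaphore-bounded concurrency, per-item retry with backoff, order-preserving results, warning logs for permanent failures) can be sketched with the standard library alone. In this hedged sketch the network call is injected as a `fetch` coroutine; a real implementation would pass an aiohttp-based function in its place, and the `fetch` parameter itself is an assumption for illustration, not part of the prompt's required signature:

```python
from __future__ import annotations

import asyncio
import logging
from typing import Awaitable, Callable

logger = logging.getLogger(__name__)


async def batch_summarize(
    texts: list[str],
    fetch: Callable[[str], Awaitable[str]],
    concurrency: int = 5,
    delays: tuple[float, ...] = (2, 4, 8),
) -> list[str | None]:
    # Cap the number of simultaneous in-flight requests.
    semaphore = asyncio.Semaphore(concurrency)

    async def worker(index: int, text: str) -> str | None:
        async with semaphore:
            for attempt in range(len(delays) + 1):
                try:
                    return await fetch(text)
                except Exception as exc:
                    # A real client would retry only on HTTP 429 or 5xx responses.
                    if attempt < len(delays):
                        await asyncio.sleep(delays[attempt])
                    else:
                        logger.warning(
                            "document %d permanently failed: %s", index, exc
                        )
                        return None
        return None  # unreachable; keeps the return type explicit

    # gather() preserves the order of the coroutines it is given,
    # so summaries come back in the same order as the input list.
    return list(await asyncio.gather(*(worker(i, t) for i, t in enumerate(texts))))
```

Failed documents map to `None` at their original index, so callers can line results up against the inputs without bookkeeping.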

What the production prompt generates

The production-ready prompt encodes six programming principles: environment-variable configuration, a typed result dataclass, a custom exception class, a single-responsibility service client, retry with exponential backoff, and full type annotations.
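A representative sketch of what an assistant typically produces from this prompt follows. Exact output varies by assistant; this version assumes the `requests` library named in the prompt and the endpoint and field names given above:

```python
from __future__ import annotations

import os
import time
from dataclasses import dataclass

import requests

API_URL = "https://api.scaledown.xyz/summarization/abstractive"


@dataclass
class SummaryResult:
    summary: str
    input_chars: int
    output_chars: int
    latency_ms: int


class ScaleDownError(Exception):
    """API error carrying the HTTP status code and response message."""

    def __init__(self, status_code: int, message: str) -> None:
        self.status_code = status_code
        self.message = message
        super().__init__(f"ScaleDown API error {status_code}: {message}")


class ScaleDownSummarizer:
    def __init__(self) -> None:
        api_key = os.environ.get("SCALEDOWN_API_KEY", "")
        if not api_key:
            raise ValueError(
                "Set the SCALEDOWN_API_KEY environment variable "
                "to your ScaleDown API key."
            )
        # The session owns the auth header; it is set once here.
        self._session = requests.Session()
        self._session.headers["x-api-key"] = api_key

    def summarize(
        self,
        text: str,
        instructions: str | None = None,
        max_tokens: int | None = None,
    ) -> SummaryResult:
        payload: dict[str, object] = {"text": text}
        if instructions is not None:
            payload["instructions"] = instructions
        if max_tokens is not None:
            payload["max_tokens"] = max_tokens

        delays = (2, 4, 8)  # seconds before retries 1, 2, and 3
        last: requests.Response | None = None
        for attempt in range(len(delays) + 1):
            response = self._session.post(API_URL, json=payload, timeout=120)
            if response.ok:
                data = response.json()
                return SummaryResult(
                    summary=data["summary"],
                    input_chars=data["input_chars"],
                    output_chars=data["output_chars"],
                    latency_ms=data["latency_ms"],
                )
            if response.status_code in (400, 401):
                # Client errors are not retriable.
                raise ScaleDownError(response.status_code, response.text)
            last = response  # 429 or 5xx: back off, then retry
            if attempt < len(delays):
                time.sleep(delays[attempt])
        assert last is not None
        raise ScaleDownError(last.status_code, last.text)
```

Treat this as a baseline to review, not a drop-in implementation: verify the retry policy and timeout against your own rate limits before shipping.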