Files chunked

Prompt 1: Generate the first 50 files

Imagine that everything you might wish to know about a person could be compacted into a single folder called `data`. That folder might have two top-level subfolders, `professional` and `personal`.

It would contain files recording contextual information about that person's life.

For example, there might be a file called `hobbies`; one called `current job details`; another called `career aspirations`.

Your task is to come up with **the first 50 files** in this structure. Structure your output in file tree notation.
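
For illustration, the opening of a typical reply might look something like this (file names beyond those mentioned above are hypothetical):

data/
├── personal/
│   ├── hobbies
│   └── ...
└── professional/
    ├── current_job_details
    ├── career_aspirations
    └── ...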

Prompt 2: Generate the next 50 files (files 51-100)

Now, continue from where we left off. Generate **the next set of 50 files**, starting from file number 51 up to file number 100, in the same structure as before.

Prompt 3: Continue with another chunk (files 101-150)

Continue generating files from where we left off. This time, create **the next set of 50 files**, starting from file number 101 up to file number 150, maintaining the same structure as before.

Prompts 4 through 9 repeat the same pattern, each advancing the range by another 50 files. The final prompt closes out the list.

Prompt 10: Final chunk (files 451-500)

We are almost done! Now generate **the final set of 50 files**, starting from file number 451 up to file number 500, following the same structure as before.

Why This Works

  • Memory Efficiency: Breaking the request into smaller chunks (e.g., groups of 50) keeps each response well within the model's context window and output-length limit, so the list never gets cut off mid-file (see the loop sketch after this list).
  • Easier Debugging: If one chunk comes back wrong, you can regenerate just that chunk instead of regenerating everything.
  • Consistency: Since each chunk follows the same structure, combining them afterwards is straightforward.
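
To make the pattern concrete, here is a minimal sketch of driving these prompts in a loop. The `ask_model` helper is a hypothetical placeholder, not a real library call; wire it to whatever chat-completion API your provider exposes. Resending the full conversation each turn is what keeps the file numbering and tree structure consistent across chunks.

def ask_model(messages: list[dict]) -> str:
    """Send the conversation so far and return the model's reply (placeholder)."""
    raise NotImplementedError("connect this to your LLM provider")

FIRST_PROMPT = (
    "Imagine that everything you might wish to know about a person could be "
    "compacted into a single folder called `data`. ... Your task is to come "
    "up with the first 50 files in this structure, in file tree notation."
)

messages = [{"role": "user", "content": FIRST_PROMPT}]
chunks = []

for start in range(1, 501, 50):           # 1, 51, 101, ..., 451
    reply = ask_model(messages)           # files start .. start + 49
    chunks.append(reply)
    messages.append({"role": "assistant", "content": reply})
    nxt = start + 50
    if nxt <= 500:
        messages.append({"role": "user", "content": (
            f"Continue from where we left off. Generate the next set of 50 "
            f"files, from file {nxt} up to file {nxt + 49}, in the same "
            f"structure as before."
        )})

full_tree = "\n".join(chunks)             # all 500 files, reassembled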

Technical Implementation

If you're using a framework like LangChain, the mirror-image problem (splitting a long input text before sending it to the model) can be automated with the same idea, using recursive or fixed-size chunking. For instance:

from langchain.text_splitter import RecursiveCharacterTextSplitter

text = "..."  # your text

# Recursively split on paragraphs, then sentences, then words,
# so each chunk stays within chunk_size characters.
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=256,    # maximum characters per chunk
    chunk_overlap=20,  # characters shared between adjacent chunks
)

docs = text_splitter.create_documents([text])
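
Each item in `docs` is a LangChain `Document`, so you can send `doc.page_content` to the model one chunk at a time, just as the manual prompts above do. For example:

for i, doc in enumerate(docs):
    print(f"chunk {i}: {len(doc.page_content)} characters")

The small `chunk_overlap` means adjacent chunks share a little context, so nothing important is lost at a chunk boundary.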