Making Code Efficient with LLMs. Part 2

Intro

In the previous article, we explored how LLMs can enhance the efficiency of Python code.

In this post, I will discuss the refactoring of pyEntropy library using LLMs, focusing on two key aspects:

Reviewing the code for style and documentation enhancements.
Performing refactoring of functions.

Style and Documentation

During the review process, it was observed that some functions had poor descriptions and were marked as private, even though their functionality seemed useful for non-private usage as well. Therefore, the first step taken was to review these functions and update their names accordingly:

_embed was renamed to time_delay_embedding.

Additionally, almost all functions now have new docstrings that provide detailed information about their types and can be easily accessed within integrated development environments. These improvements aim to enhance the overall understanding and usability of the library.

Refactoring of Functions

Refactoring functions can be a challenging task that involves reviewing their complexity and finding ways to improve them. In this iteration, a different approach was taken, utilising the concept of Chain of Thought.

Chain of Thought or CoT refers to breaking down a problem into a series of intermediate reasoning steps. This approach has shown significant improvements in the ability of Language Models to perform complex reasoning tasks. For more information on Chain of Thought, you can refer to the following resources: 1 and 2.

These references provide valuable insights into the use and benefits of CoT in enhancing the capabilities of LLMs.

To enhance the process of refactoring functions, a new three-step approach was introduced:

The first step involved asking the LLM to review the function and provide an explanation of its purpose and functionality. This helped gain a clear understanding of the function’s current implementation.

Let’s review function in python that uses numpy library. Our goal is to improve its efficiency. Do not rush to improvements. As the first step, let’s review it and build understanding of what it does. Here is a function:
In the second step, the LLM was tasked with proposing changes and improvements to the function. This allowed for a more thorough analysis of potential enhancements, taking into account different perspectives and considerations.

Now continue to the improvements
Only after these two steps were completed did the LLM proceed to provide a refactored version of the function. This approach deviated from the traditional one-hop approach, which typically suffices for simple functions in the majority of cases. The new approach aimed to foster a more comprehensive and refined refactoring process, yielding improved function designs.

By incorporating these three steps, the refactoring process became more iterative and robust, leveraging the strengths of the LLM in understanding and optimising code. An example of the approach you can find here.

To ensure the correctness of the function rewrites and actual improvements, three checks were implemented:

Firstly, unit tests were conducted for both the original and new functions using np.allclose. This helps validate that the outputs of the functions remain consistent after refactoring.

Secondly, property-based testing was performed using the hypothesis library. For instance, a test for the weighted_permutation_entropy function was defined as follows:

from hypothesis import given, strategies as st
from hypothesis.extra.numpy import arrays


class TestRewrites(unittest.TestCase):
  @given(
    arrays(np.float64, 100),
    st.integers(min_value=1, max_value=5),
    st.integers(min_value=1, max_value=5),
    st.booleans()
  )
  def test_weighted_permutation_entropy(
    self,
    time_series,
    order,
    delay,
    normalize
  ):
    np.testing.assert_allclose(
      weighted_permutation_entropy(
        time_series, order, delay, normalize
      ),
      weighted_permutation_entropy_rewrite(
        time_series, order, delay, normalize
      )
    )


if __name__ == '__main__':
  unittest.main()

This property-based test generates random input arrays and verifies that the outputs of both the original and refactored versions of the weighted_permutation_entropy function are close. Such tests help ensure that the refactored functions maintain their intended behaviour and produce reliable results.

Furthermore, the efficiency of the changes made to the functions was evaluated using timeit.repeat. This measurement helped to ensure that the refactored functions exhibited improved efficiency compared to their original counterparts.

As a result of the refactoring process, the rewrites can be categorised into two distinct groups:

Unsuccessful refactoring:

Some rewrites resulted in incorrect results, deviating from the expected output.
In certain cases, the refactored functions did not demonstrate any improvements or even performed worse than the original functions.

Successful refactoring:

On the other hand, there were successful rewrites that yielded the desired improvements in terms of functionality, efficiency, or both.

These categories help identify the effectiveness of the refactoring process and provide insights into the specific areas where improvements were achieved or fell short.

Unsuccessful refactoring

In the case of functions that involve multiple intricate steps, the refactoring attempts proved to be unsuccessful. It became evident that although refactoring may appear beneficial initially, it actually obscures important details that the LLM is unable to comprehend. While the individual steps were described accurately, certain functions were already optimised, making it difficult to identify potential changes for improvement.

However, there were instances where a clear change could be identified, leading to a successful rewrite. For example, Sam Dotson provided a valuable tip regarding the utilisation of a hashmap for counts in the weighted_permutation_entropy function, resulting in an effective rewrite.

To address these challenges and properly evaluate the rewrites, the introduction of unittests and timeit.repeat for each function becomes crucial. Unittests help identify problematic rewrites by comparing the outputs of the original and refactored functions, ensuring correctness. Additionally, timeit.repeat measurements enable the assessment of the refactored functions’ efficiency, determining if any improvements have been achieved.

Successful refactoring

In the majority of cases, the suggested improvements made by LLM led to increased efficiency in the functions:

util_granulate_time_series demonstrated a speed improvement ranging from 1.2 to 2 times faster compared to the original implementation.
util_pattern_space exhibited a significant speed boost, ranging from 40 to 250 times faster, depending on the size of the input.
weighted_permutation_entropy achieved a speed improvement of approximately 3.5 to 5 times faster by utilising a hashmap for efficient counting.

These improvements highlight the value of the refactoring process and the effectiveness of the suggestions provided by LLM in enhancing the efficiency of the functions.

Lessons

Through the process of refactoring functions using LLMs, several valuable lessons were learned:

Utilise Chain of Thought approach: Instead of directly asking for a new improved version of a function, it is beneficial to break down the process into three steps:
Ask the LLM to review and describe the functionality of the function.
Request potential improvements from the LLM.
Seek improvements that are sensible and align with the context of the function.
Complex functions require additional guidance: LLMs may struggle with understanding and improving complex functions that involve intricate implementation details. However, with proper guidance and specific suggestions, even complex functions can benefit from refactoring using LLMs.
Employ both unit tests and property-based testing: In addition to traditional unit tests, the use of property-based testing can be highly beneficial. Libraries like hypothesis can generate diverse inputs to thoroughly test the functionality and correctness of the refactored functions.

By applying these lessons, the process of refactoring functions with the assistance of LLMs can be more effective, resulting in improved code quality, efficiency, and maintainability.

Results

You can find the pull request and track the progress of the proposed improvements here: GitHub PR to pyEntropy