How to decrease model inference time #22

Hi Team,

I am trying to use this for my application logs by tweaking the security-prompt a bit.

A few observations and questions:

I tried switching to the 32B model, but performance was even slower.
The analysis takes quite a lot of time to run: with 100 chunks it took almost 5 minutes, and if we increase the file size to 5k the analysis will take almost 10 minutes.
Is there a way to speed up inference so that the analysis runs more quickly?
Also, I am running this on A100 (4 GPUs). Is there another machine configuration on which it would be faster and more accurate?
Do you have any other suggestions for making it run faster and in a more optimized way?
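
To make timings comparable across model sizes and machine configurations, per-chunk latency could be measured with something like the sketch below. This is only an illustration: `time_chunks` and `analyze_chunk` are hypothetical names, and `analyze_chunk` stands in for whatever function actually sends one chunk to the model; it is not part of this project's API.

```python
import time
from statistics import mean
from typing import Callable, Iterable


def time_chunks(analyze_chunk: Callable[[str], object], chunks: Iterable[str]) -> dict:
    """Run analyze_chunk on each chunk and report simple timing statistics."""
    latencies = []
    for chunk in chunks:
        start = time.perf_counter()
        analyze_chunk(chunk)  # hypothetical placeholder for the real model call
        latencies.append(time.perf_counter() - start)
    total = sum(latencies)
    return {
        "chunks": len(latencies),
        "total_s": total,
        "mean_s": mean(latencies) if latencies else 0.0,
        "chunks_per_s": len(latencies) / total if total > 0 else 0.0,
    }


if __name__ == "__main__":
    # Dummy workload so the sketch runs without a model or GPU.
    fake_chunks = [f"log line {i}" for i in range(5)]
    stats = time_chunks(lambda text: text.upper(), fake_chunks)
    print(stats)
```

For reference, 100 chunks in almost 5 minutes works out to roughly 3 seconds per chunk, which is the number a run like this would report as `mean_s`.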
