Hello,

I'm a Windows server admin, not a network technician, and not even a Wireshark user. I'm analysing a performance problem with an application. The application reads and writes a small config file (ini) on a network share. If the config file is placed on our 3rd-party storage, the application performance is bad. If the config file is placed on a normal Microsoft Windows share, the application performance is good.

I am trying to analyse and compare the two scenarios with Wireshark, and I see big differences. Please find below a graphical comparison:

Link to full comparison picture: http://www.fotos-hochladen.net/uploads/wiresharkcqgx07prlf.png

The comparison shows the same process (application start). When the config file is on the 3rd-party storage, the traffic looks like this the whole time.

Link to full picture: http://www.fotos-hochladen.net/uploads/packetsyq1d5j0s96.png

When I "Follow TCP Stream" in Wireshark, I can see the content of the file xxxx times.

Does anybody have an idea about this? I can see that something is wrong, but I don't know how to analyse it further ...

Thank you!

asked 23 Jan '17, 03:18 Panteraa edited 23 Jan '17, 03:22
One Answer:
Hi Panteraa, I assume that your screenshot shows the connection of a single workstation to the server. A good starting point for the analysis is Statistics -> Service Response Times -> SMB2 (or SMB, if your filer does not support SMB2). The screenshot shows that the client is running in a tight loop and constantly repeats three operations: open, read, and close.
The timestamps in the second picture show delta times of a few milliseconds between the loop iterations. Although your question mentions write activity, no writes are visible in the trace. Tuning is not too hard:
OK, you can't cache. It must be just another stupid application. Here are a few more options:
There are a number of reasons why the 3rd-party storage server could be slower than the Windows server. Among the possible reasons are:
More details would require a closer look at trace files from both systems, the network architecture, and the configuration of the two servers.
Good hunting
answered 23 Jan '17, 12:40 packethunter
Hi packethunter,
thank you for the effort and the extensive information.
The filer supports SMB2. The stats look like this:
http://www.fotos-hochladen.net/uploads/stats7eckw51vqr.png
That's my target, but I don't think the application is the cause, because when the configuration-file is placed on a Windows-Share it doesn't show the open-read-close loop behaviour.
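To put a number on the loop, one option is to export the SMB2 command names for the stream and count how often the open-read-close pattern repeats in each trace. The following is only a rough sketch: it assumes the commands have already been extracted into a list of strings, and the function name and the sample list are hypothetical, not taken from the actual trace.

```python
def count_pattern_repeats(commands, pattern=("Create", "Read", "Close")):
    """Count non-overlapping occurrences of `pattern` in a command sequence."""
    pattern = tuple(pattern)
    width = len(pattern)
    count = 0
    i = 0
    while i + width <= len(commands):
        if tuple(commands[i:i + width]) == pattern:
            count += 1
            i += width  # skip past the matched iteration
        else:
            i += 1
    return count


# Hypothetical command sequence, for illustration only.
sample = ["Create", "Read", "Close"] * 3 + ["Tree Disconnect"]
print(count_pattern_repeats(sample))
```

Running this against the command list from the Windows share and from the 3rd-party storage would give two counts that can be compared directly.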
I will have a closer look at your other suggestions, but I think I will need to get some paid consulting on this.
Thank you
One last thing to try before you fill out that purchase request:
The most likely candidate for what you are seeing is BranchCache. This feature is described in multiple articles on MSDN. Here is a good introduction: https://technet.microsoft.com/en-us/library/dd637832(v=ws.10).aspx
Please compare the TreeConnect Responses from the storage server and the Windows system. There might be a few tiny differences, including individual bits which Wireshark may not (yet) interpret correctly.
The cache has to be enabled on a per-share basis.
Good luck
Hi packethunter,
I know BranchCache. I came into contact with it in a course for a Microsoft exam. We don't use it in our company at all. The service is not running and the role is not activated on the client. Would you say that there is a setting on the storage that we could try to control the behaviour?
I have prepared a comparison between the TreeConnect of the two scenarios ...
Tree Connect Request: http://www.fotos-hochladen.net/uploads/01treeconnecq7wamg48io.png
Tree Connect Response: http://www.fotos-hochladen.net/uploads/02treeconnecth839ialsf1.png
Are you able to see anything suspicious?
I noticed that the client requests 71 credits from the 3rd-party storage. The 3rd-party storage grants 1 credit.
In the other scenario the client requests 95 credits from the Windows server. The Windows server grants 33 credits. Could this have anything to do with the issue?
Thank you!
There are only two extra bits set by the 3rd-party storage. I'm not aware that either of the two bits would cause any client-side caching.
It is noteworthy that the Windows server responds about twice as fast (680 microseconds vs. 1400). The delay might be caused, at least partially, by the two extra hops between client and 3rd-party storage (IP TTL 127 vs. 125 indicates 1 vs. 3 hops, or a rather strange IP stack).
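The hop estimate from the TTL values can be sketched as follows. This assumes that the sender started from one of the common initial TTL values (64, 128 or 255) and that each router decrements the TTL by one; a stack with an unusual initial TTL would give a misleading result. The function name is made up for this illustration.

```python
COMMON_INITIAL_TTLS = (64, 128, 255)  # assumption: typical OS defaults


def estimate_hops(observed_ttl):
    """Guess the number of router hops from the TTL seen at the capture point."""
    if not 1 <= observed_ttl <= 255:
        raise ValueError("TTL must be between 1 and 255")
    # Pick the smallest common initial TTL that is not below the observed one.
    initial = min(t for t in COMMON_INITIAL_TTLS if t >= observed_ttl)
    return initial - observed_ttl


for ttl in (127, 125):
    print(ttl, "->", estimate_hops(ttl), "hop(s)")
```

For the two values from the screenshots this yields 1 hop for TTL 127 and 3 hops for TTL 125.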
The Credits would be a limiting factor if the client issues either multiple requests or requests dealing with more than 64 kByte.
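As a rough illustration of the 64 kByte rule: as far as I understand the MS-SMB2 specification, the credit charge of a request is derived from the larger of the request and response payload sizes, in units of 65536 bytes. The sketch below implements that reading; the function name and the example sizes are made up, and the formula should be checked against the specification before relying on it.

```python
CREDIT_UNIT = 65536  # 64 kByte per credit


def credit_charge(request_payload_bytes, response_payload_bytes):
    """Credits consumed by one request, per the formula assumed above."""
    largest = max(request_payload_bytes, response_payload_bytes)
    if largest <= 0:
        return 1  # a request without payload still costs one credit
    return (largest - 1) // CREDIT_UNIT + 1


print(credit_charge(0, 4096))     # small ini file read
print(credit_charge(0, 1048576))  # 1 MiB read
```

Under this reading, a read of a small ini file costs a single credit, so even one granted credit would not throttle a client that issues its requests strictly one after another.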
If you are interested in a quick way to visualize credits you might want to check my blog posting on packet-foo.com: https://blog.packet-foo.com/2016/10/trace-file-case-files-smb2-performance/
Please note, that the blog post discusses a different problem.
I am afraid that I cannot help you further without a full trace starting with the SMB protocol negotiation.
Good luck
As @packethunter says, it's difficult to analyse this type of problem without visibility of the trace. I did have one thought regarding the difference between the two devices, in particular regarding the amount of data moved across the network.
SMB2 has a mechanism called Leasing (in SMB1 it was called OpLock). A workstation can request to lease the data it gets from a file server. Leasing means that a workstation can hold file data in a memory buffer where an application can perform repeated operations on it before eventually flushing it to the file server. This improves application performance and reduces network load.
The Lease is requested when the file is opened (SMB2 Create Request) and granted in the Create Response. Wireshark still labels the fields involved as Oplock.
Try comparing the Oplock values in the Create Req and Rsp on the Windows share with that on the NAS.
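When reading the Oplock field by hand, a small lookup like the one below may help. The level values are the ones defined for the SMB2 Create request and response as far as I know (0xFF meaning that a lease was requested or granted through a create context); the helper names and the example values are only for illustration and do not come from the actual trace.

```python
# SMB2 oplock levels as carried in the Create request/response.
OPLOCK_LEVELS = {
    0x00: "None",
    0x01: "Level II",
    0x08: "Exclusive",
    0x09: "Batch",
    0xFF: "Lease",
}


def describe_oplock(level):
    """Return a readable name for an SMB2 oplock level byte."""
    return OPLOCK_LEVELS.get(level, f"Unknown (0x{level:02x})")


def compare_oplocks(requested, granted):
    """Summarise what the client asked for and what the server granted."""
    return f"requested {describe_oplock(requested)}, granted {describe_oplock(granted)}"


# Hypothetical values: client asks for a lease, server grants nothing.
print(compare_oplocks(0xFF, 0x00))
```

If one server grants a lease or oplock and the other answers with "None", the client has no permission to cache the file, which would fit the repeated reads.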
If you want to analyze the SMB2 performance in detail, check out the TRANSUM plugin for Wireshark at https://community.tribelab.com/mod/page/view.php?id=492
TRANSUM gives you a breakdown of the response time of each SMB2 command, splitting it into server time and network time.
Best regards...Paul
Hi,
I think I would be able to sanitize the trace files and upload them, but I don't want you to spend time analysing my data for free.
Nice. I have visualized the credits in the trace:
https://picload.org/image/roawpgaa/iograph.png (https://img3.picload.org/image/roawcowp/iograph.png)
I also thought about a caching-issue. I find it strange, because I did also try to disable OpLocks on the Windows-Server and made a test again, but it was fast too ...!?
I have compared the values. I noticed that the client sends an "SMB2_CREATE_DURABLE_HANDLE_REQUEST" (DHnQ) in both scenarios. The Windows server includes "DHnQ" in its response, but the 3rd-party storage does not:
https://picload.org/image/roawclgg/requestresponse.png
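The same comparison can be done for all create contexts at once by treating the context tags of the request and of each response as sets. This is only a sketch: the tag lists below are hypothetical examples, apart from DHnQ, which is the one observed above, and the function name is made up.

```python
def unanswered_contexts(requested, responded):
    """Return the create-context tags the client sent but the server did not echo."""
    return sorted(set(requested) - set(responded))


# Hypothetical tag lists for illustration; only DHnQ is taken from the thread.
request_tags = ["DHnQ", "MxAc"]
windows_response = ["DHnQ", "MxAc"]
storage_response = ["MxAc"]

print("Windows:", unanswered_contexts(request_tags, windows_response))
print("Storage:", unanswered_contexts(request_tags, storage_response))
```

A list of unanswered tags per server makes it easier to hand a concise difference to the storage vendor.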
All in all, your answers were very helpful. I will mark them so you get reputation. I think I have collected enough material to hand the case over to the storage team/manufacturer.
@packethunter:
May I contact you at the e-mail address given on your blog regarding a quote for an analysis of a different matter? Do you offer that kind of service (with an invoice, etc.)?
Thank you
Sounds like a good time to engage with the storage manufacturer.
If you want a quick start to SMB analysis, you might want to take a look at the SMB2 Overview page on TribeLab - https://community.tribelab.com/mod/page/view.php?id=608